> For the complete documentation index, see [llms.txt](https://simian114.gitbook.io/blog/llms.txt). Markdown versions of documentation pages are available by appending `.md` to page URLs; this page is available as [Markdown](https://simian114.gitbook.io/blog/undefined/undefined-1/04.-leetcode-819.md).

# 04. 가장 흔한 단어(leetcode: 819)

{% hint style="info" %}
[**문제링크**](https://leetcode.com/problems/most-common-word/)

[**문자열에서 구두점 없애는 방법 4가지**](https://www.delftstack.com/ko/howto/python/how-to-strip-punctuation-from-a-string-in-python/)

[**리스트 컴프리헨션**](https://wikidocs.net/22805)

[**리스트 컴프리헨션 연습1**](https://www.learnpython.org/en/List_Comprehensions)

[**리스트 컴프리헨션 연습2**](https://gist.github.com/doughsay/cb50d9e4d344230ebc166255a202f81d)

[**파이썬 Counter**](https://www.daleseo.com/python-collections-counter/)

[**파이썬 collections**](https://docs.python.org/3/library/collections.html)
{% endhint %}

## 분석

인자로 문자열과, 금지된 단어의 배열이 있다.

먼저 인자를 출력해보자.

```
"Bob hit a ball, the hit BALL flew far after it was hit."
["hit"]
```

보다시피 인자는 문자열에는 **대문자** 와 **구두점**이 들어있다. 따라서 이에 대한 처리가 우선되어야한다.

### 대문자를 소문자로

문자열 메서드 `str.lower()` 를 사용하자

### 구두점을 없애는 방법

1. 정규식으로 없애보자

   ```python
   after = re.sub(r'[^\w\s]', '', paragraph)
   ```

   문자열 paragraph 에서 단어와 공백을 제외한 나머지를 전부 없앤다.

문자열에서 구두점까지 없앴으면 이제 이 문자열을 split 해서 배열로 만들자. 아 이전에 lower은 잊지말고

### 문자열을 배열로

```python
words = after.lower().split(' ')
```

### 하지만 우리는 인자로 banned 배열이 주어졌고 해당 배열에 없는 단어만 챙겨야 한다.

즉, 우리가 만들어야 하는 단어의 배열 **words** 에는 banned 에는 없는 단어만 넣어야한다. 이 경우에는 **리스트 컴프리헨션** 이 아주 유용하게 사용된다.

```python
words = [word for word in after.lower().split(' ') if word not in banned]
```

### 단어의 배열을 Counter 객체로 만들자

우리는 **가장 흔한 단어** 를 찾아야한다. 가장 흔한 단어를 찾기 위해서는 배열에서 수를 세야한다. Map 을 만들고 반복문을 통해서 하나하나 증가시킬 수도 있지만 **Counter** 객체를 이용하면 금방 해결할 수 있다.

```python
counter = Collections.Counter(words)
return counter.most_common(1)[0][0] # 가장 흔한거 하나만 찾고 그 키 값을 찾는다
```

## 풀이

위에서 필요한 모든것을 알아봤으므로 이제 문제를 풀자.

1. 들어온 문자열에서 구두점을 없앤다.
2. 문자열을 배열로 만들자. 이 때 banned 에 존재하는 단어는 포함시키지 말자
3. Counter 객체로 만들고 **most\_common** 메서드를 이용해서 가장 흔한 단어 하나만 찾자.

```python
class Solution:
    def mostCommonWord(self, paragraph: str, banned: List[str]) -> str:     
        words = [word for word in re.sub(r'[^\w\s]', ' ', paragraph)
                 .lower().split() if word not in banned]
        counter = collections.Counter(words)
        return counter.most_common(1)[0][0]
```