====== Installing BeautifulSoup ======
===== Python virtual environment =====
Check the Python version:\\
$ python -VV
Python 3.7.7 (default, Mar 13 2020, 10:23:39)
[GCC 9.2.1 20190827 (Red Hat 9.2.1-1)]
Create a Python virtual environment:\\
$ python -m venv py3_scraping
Activate the Python virtual environment:\\
$ . py3_scraping/bin/activate
(py3_scraping) $
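The changed prompt is the usual sign that activation worked. As a complementary check from inside Python, the following minimal sketch compares ''sys.prefix'' with ''sys.base_prefix''; the helper name ''in_virtualenv'' is hypothetical and not part of the original notes. It assumes the environment was created with ''python -m venv'' as above.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter runs inside a venv-created environment.

    In a venv, sys.prefix points at the environment directory while
    sys.base_prefix still points at the base Python installation.
    """
    return sys.prefix != sys.base_prefix


print("virtual environment active:", in_virtualenv())
print("sys.prefix:", sys.prefix)
```

Run inside the activated environment, the first line should report ''True'' and ''sys.prefix'' should end in ''py3_scraping''.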
Upgrade pip inside the virtual environment:\\
(py3_scraping) $ python -m pip install --upgrade pip
Collecting pip
Downloading https://files.pythonhosted.org/packages/54/2e/df11ea7e23e7e761d484ed3740285a34e38548cf2bad2bed3dd5768ec8b9/pip-20.1-py2.py3-none-any.whl (1.5MB)
|████████████████████████████████| 1.5MB 2.0MB/s
Installing collected packages: pip
Found existing installation: pip 19.1.1
Uninstalling pip-19.1.1:
Successfully uninstalled pip-19.1.1
Successfully installed pip-20.1
===== Installation =====
To install BeautifulSoup, run the following command:\\
$ pip install beautifulsoup4
Collecting beautifulsoup4
Downloading beautifulsoup4-4.9.0-py3-none-any.whl (109 kB)
|████████████████████████████████| 109 kB 1.9 MB/s
Requirement already satisfied: soupsieve>1.2 in ./py3_scraping/lib/python3.7/site-packages (from beautifulsoup4) (2.0)
Could not build wheels for soupsieve, since package 'wheel' is not installed.
Installing collected packages: beautifulsoup4
Successfully installed beautifulsoup4-4.9.0
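To confirm that the package imports and parses correctly, a short smoke test along the following lines may help. This is only a sketch: the HTML string is made up for illustration, and it uses Python's built-in ''html.parser'' so that no additional parser needs to be installed. Note that the package is installed as ''beautifulsoup4'' but imported as ''bs4''.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML used only for this smoke test.
HTML = """
<html><body>
<h1>Sample page</h1>
<ul>
  <li><a href="/first">First</a></li>
  <li><a href="/second">Second</a></li>
</ul>
</body></html>
"""

# "html.parser" is the parser from the standard library.
soup = BeautifulSoup(HTML, "html.parser")

# Text of the first <h1> element.
title = soup.h1.get_text(strip=True)

# (link text, href) pairs for every <a> element, in document order.
links = [(a.get_text(strip=True), a["href"]) for a in soup.find_all("a")]

print(title)
print(links)
```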
Install requests if needed:\\
$ pip install requests
Collecting requests
Downloading requests-2.23.0-py2.py3-none-any.whl (58 kB)
|████████████████████████████████| 58 kB 1.9 MB/s
Collecting chardet<4,>=3.0.2
Downloading chardet-3.0.4-py2.py3-none-any.whl (133 kB)
|████████████████████████████████| 133 kB 6.0 MB/s
Collecting urllib3!=1.25.0,!=1.25.1,<1.26,>=1.21.1
Downloading urllib3-1.25.9-py2.py3-none-any.whl (126 kB)
|████████████████████████████████| 126 kB 14.0 MB/s
Collecting idna<3,>=2.5
Downloading idna-2.9-py2.py3-none-any.whl (58 kB)
|████████████████████████████████| 58 kB 24.3 MB/s
Collecting certifi>=2017.4.17
Downloading certifi-2020.4.5.1-py2.py3-none-any.whl (157 kB)
|████████████████████████████████| 157 kB 12.5 MB/s
Installing collected packages: chardet, urllib3, idna, certifi, requests
Successfully installed certifi-2020.4.5.1 chardet-3.0.4 idna-2.9 requests-2.23.0 urllib3-1.25.9
Install pandas if needed:\\
$ pip install pandas
Collecting pandas
Downloading pandas-1.0.3-cp37-cp37m-manylinux1_x86_64.whl (10.0 MB)
|████████████████████████████████| 10.0 MB 2.1 MB/s
Collecting numpy>=1.13.3
Downloading numpy-1.18.3-cp37-cp37m-manylinux1_x86_64.whl (20.2 MB)
|████████████████████████████████| 20.2 MB 12.2 MB/s
Collecting python-dateutil>=2.6.1
Downloading python_dateutil-2.8.1-py2.py3-none-any.whl (227 kB)
|████████████████████████████████| 227 kB 12.7 MB/s
Collecting pytz>=2017.2
Downloading pytz-2020.1-py2.py3-none-any.whl (510 kB)
|████████████████████████████████| 510 kB 10.2 MB/s
Collecting six>=1.5
Downloading six-1.14.0-py2.py3-none-any.whl (10 kB)
Installing collected packages: numpy, six, python-dateutil, pytz, pandas
Successfully installed numpy-1.18.3 pandas-1.0.3 python-dateutil-2.8.1 pytz-2020.1 six-1.14.0
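After the installs, it can be useful to check from Python which of the packages are actually importable in the active environment. The sketch below uses ''importlib.util.find_spec'' from the standard library; the helper name ''missing_imports'' is hypothetical. It takes import names rather than pip package names, so ''beautifulsoup4'' is checked as ''bs4''.

```python
import importlib.util


def missing_imports(names):
    """Return the top-level import names that cannot be found.

    find_spec() locates a top-level module without importing it and
    returns None when the module is not available.
    """
    return [name for name in names if importlib.util.find_spec(name) is None]


# Import names for the packages installed above.
print("missing:", missing_imports(["bs4", "requests", "pandas"]))
```

An empty list means all three packages can be found by the interpreter that ran the check.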
===== References =====
[[https://gammasoft.jp/blog/difference-find-and-select-in-beautiful-soup-of-python/|Differences between find_all() and select() in Beautiful Soup | ガンマソフト株式会社]]\\
[[https://akatak.hatenadiary.jp/entry/2018/08/12/093126|Acquiring and analyzing real-estate data with Python (1) [Rental properties / web scraping] - akatak’s blog]]\\