Problem with Whoscored scraper #698

Closed
Oktay7v2 opened this issue Sep 5, 2024 · 1 comment
Labels
bug: Something isn't working
duplicate: This issue or pull request already exists
WhoScored: Issue or pull request related to the WhoScored scraper

Comments

Oktay7v2 commented Sep 5, 2024

Describe the bug
When I run a simple call such as sd.WhoScored(leagues=..., seasons=...), it fails on the translation of the league names. I tried making some changes in the config.py file, which seemed to work, but when I run any other call, such as reading the schedule or using the API, ChromeDriver opens WhoScored and then does nothing. I think the problem may be related to the site's language: when ChromeDriver opens WhoScored, it loads the Italian version directly, and that may be breaking the scraping, but I'm not sure about it.

Affected scrapers
WhoScored

Code example

import soccerdata as sd

ws = sd.WhoScored(leagues="ENG-Premier League", seasons=2021, no_cache=True)

epl_schedule = ws.read_schedule()
epl_schedule.head()

Error message

KeyError                                  Traceback (most recent call last)
Cell In[20], line 6
      3 ws = sd.WhoScored(leagues="ENG-Premier League", seasons=2021, no_cache=True)
      4 print(ws.__doc__)
----> 6 epl_schedule = ws.read_schedule()
      7 epl_schedule.head()

File c:\Users\oktay\AppData\Local\Programs\Python\Python310\lib\site-packages\soccerdata\whoscored.py:344, in WhoScored.read_schedule(self, force_cache)
    331 def read_schedule(self, force_cache: bool = False) -> pd.DataFrame:
    332     """Retrieve the game schedule for the selected leagues and seasons.
    333 
    334     Parameters
   (...)
    342     pd.DataFrame
    343     """
--> 344     df_season_stages = self.read_season_stages(force_cache=force_cache)
    345     filemask_schedule = "matches/{}_{}_{}_{}.json"
    347     all_schedules = []

File c:\Users\oktay\AppData\Local\Programs\Python\Python310\lib\site-packages\soccerdata\whoscored.py:274, in WhoScored.read_season_stages(self, force_cache)
    261 def read_season_stages(self, force_cache: bool = False) -> pd.DataFrame:
    262     """Retrieve the season stages for the selected leagues.
    263 
    264     Parameters
...
-> 6249         raise KeyError(f"None of [{key}] are in the [{axis_name}]")
   6251     not_found = list(ensure_index(key)[missing_mask.nonzero()[0]].unique())
   6252     raise KeyError(f"{not_found} not in index")

KeyError: "None of [Index(['ENG-Premier League'], dtype='object', name='league')] are in the [index]"
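
For readers unfamiliar with this pandas error: it is raised when a label-based lookup asks for labels that are all absent from the index. The sketch below reproduces it in isolation, assuming (as the report suspects, but does not confirm) that the scraped table ended up indexed by localized league names. The table contents, the Italian-style label, and the select_leagues helper are hypothetical and are not part of soccerdata.

```python
import pandas as pd

# Hypothetical stand-in for a scraped league table. The localized label is an
# assumption made for illustration; it is not taken from the report.
scraped = pd.DataFrame(
    {"league": ["ITA-localized Premier League"], "n_stages": [1]}
).set_index("league")

# The league identifier requested by the user, as in the report.
requested = pd.Index(["ENG-Premier League"], name="league")

# A plain .loc lookup raises KeyError when none of the labels exist.
try:
    scraped.loc[requested]
    error_message = None
except KeyError as exc:
    error_message = exc.args[0]

print(error_message)


def select_leagues(table: pd.DataFrame, leagues: pd.Index) -> pd.DataFrame:
    """Hypothetical helper: check labels first and fail with a readable message."""
    missing = leagues.difference(table.index)
    if len(missing) > 0:
        raise ValueError(
            f"Leagues not found: {list(missing)}; available: {list(table.index)}"
        )
    return table.loc[leagues]
```

Checking the labels before the lookup does not fix the underlying mismatch, but it shows which names the table actually contains, which may help confirm whether the site language is the cause.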

Contributor Action Plan
I’m not able to fix this issue.

Oktay7v2 added the bug label Sep 5, 2024
probberechts (Owner) commented

This is a duplicate of #440. In #660, @Messe57 suggested that the following works: #440 (comment)

probberechts added the duplicate and WhoScored labels Sep 5, 2024