This dataset covers top bowlers, the players who have taken the most wickets in their cricket careers. Alongside wickets, it includes other features that describe a player's performance over their career. Here are the features we can work with:
- Player: Player's name
- Span: Playing span or career duration of a player
- Mat: No. of matches played
- Inns: No. of innings bowled
- Balls: No. of balls bowled
- Runs: No. of runs conceded
- Wkts: Total no. of wickets taken
- BBI: Best Bowling in Innings, the bowler's best figures in a single innings, e.g., 9/51 means 9 wickets taken for 51 runs conceded
- BBM: BBM stands for Best Bowling in Match and gives the combined score over 2 innings in one match
- Ave: Average (runs allowed per wicket taken)
- Econ: Economy rate (runs plus extras allowed per over)
- SR: Strike Rate (balls bowled per wicket taken)
- 5: number of times this bowler has taken five wickets in an innings
- 10: number of times this bowler has taken ten wickets in a match (over both innings of a Test)
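The three rate features (Ave, Econ, SR) follow from Balls, Runs, and Wkts as defined in the list above. The snippet below is a minimal sketch of that arithmetic. The two bowlers and their figures are made up for illustration, and the Econ line assumes six-ball overs, which the dataset description does not state.

```python
import pandas as pd

# Made-up figures for two hypothetical bowlers (not taken from the dataset).
df = pd.DataFrame({
    "Player": ["Bowler A", "Bowler B"],
    "Balls": [6000, 3000],
    "Runs": [2500, 1500],
    "Wkts": [100, 50],
})

# Ave: runs conceded per wicket taken.
df["Ave"] = df["Runs"] / df["Wkts"]

# SR: balls bowled per wicket taken.
df["SR"] = df["Balls"] / df["Wkts"]

# Econ: runs conceded per over (assumption: one over = six balls).
df["Econ"] = df["Runs"] / (df["Balls"] / 6)

print(df[["Player", "Ave", "SR", "Econ"]])
```

Lower values are better for all three: a low Ave or SR means wickets come cheaply or quickly, and a low Econ means few runs conceded per over.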
Data Analysis:
- Using various Python libraries (such as Pandas)
- Reading different file types into a Pandas DataFrame (.csv, .xlsx, etc.)
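As a rough sketch of the file-reading step, the example below loads CSV data into a DataFrame. To keep it self-contained it reads from an in-memory string rather than a file on disk; the rows are invented, and the file names in the comments are hypothetical placeholders, not the dataset's actual files.

```python
import io

import pandas as pd

# Stand-in for a CSV file on disk; the rows are invented for illustration.
csv_text = """Player,Span,Mat,Wkts
Bowler A,1992-2010,100,300
Bowler B,1990-2007,90,250
"""

# read_csv accepts a file path or any file-like object.
df = pd.read_csv(io.StringIO(csv_text))

# With real files the calls would look like this (file names are hypothetical):
# df = pd.read_csv("bowlers.csv")
# df = pd.read_excel("bowlers.xlsx")  # needs an Excel engine such as openpyxl

print(df.head())
```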
Data Manipulation:
- Create and name a new DataFrame in Pandas
- Find the number of rows and columns in the DataFrame
- Find the data statistics of the dataset
- Find the data types and missing values
- Rename column names
- Remove unnecessary columns
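The manipulation steps listed above could look roughly like the following. This is a sketch on a small invented DataFrame: the new column names (`Matches`, `FiveWktHauls`) and the leftover `Unnamed: 0` index column are assumptions chosen for illustration, not names confirmed by the dataset.

```python
import numpy as np
import pandas as pd

# Create and name a new DataFrame (values are invented).
df = pd.DataFrame({
    "Player": ["Bowler A", "Bowler B", "Bowler C"],
    "Mat": [100, 90, 80],
    "Wkts": [300, 250, np.nan],   # one missing value on purpose
    "5": [20, 15, 10],
    "Unnamed: 0": [0, 1, 2],      # assumed leftover index column
})

# Number of rows and columns.
n_rows, n_cols = df.shape

# Summary statistics for the numeric columns.
summary = df.describe()

# Data types and missing values per column.
dtypes = df.dtypes
missing = df.isna().sum()

# Rename columns to more readable (hypothetical) names.
df = df.rename(columns={"Mat": "Matches", "5": "FiveWktHauls"})

# Remove a column that is not needed.
df = df.drop(columns=["Unnamed: 0"])

print(n_rows, n_cols)
print(missing)
print(df.columns.tolist())
```

`df.info()` is an alternative that prints data types and non-null counts in a single call.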
Data Preprocessing:
- Extract new information from columns
- Creating a function-based column
- Splitting a column into two new columns and removing the string from a column
- DataFrame sorting
- DataFrame slicing
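A minimal sketch of the preprocessing steps above, again on invented data. It assumes that Span is stored as a "start-end" string and that Player carries a team code in parentheses; both formats are assumptions and should be checked against the real file. The `wicket_band` helper and the derived column names are hypothetical.

```python
import pandas as pd

# Invented rows; the Span and Player formats are assumptions.
df = pd.DataFrame({
    "Player": ["Bowler A (AUS)", "Bowler B (IND)", "Bowler C (ENG)"],
    "Span": ["1992-2007", "1990-2008", "2003-2020"],
    "Wkts": [300, 250, 400],
})

# Split Span into two new integer columns.
df[["Start", "End"]] = df["Span"].str.split("-", expand=True).astype(int)

# Extract new information: career length in years.
df["Years"] = df["End"] - df["Start"]

# Keep the team code in its own column, then remove it from Player.
df["Country"] = df["Player"].str.extract(r"\((.*?)\)", expand=False)
df["Player"] = df["Player"].str.replace(r"\s*\(.*?\)", "", regex=True)


def wicket_band(wkts):
    """Label a bowler by wicket count (hypothetical helper)."""
    return "300+" if wkts >= 300 else "under 300"


# Function-based column: apply the helper to every row.
df["Band"] = df["Wkts"].apply(wicket_band)

# Sort by wickets, highest first, and renumber the rows.
df = df.sort_values("Wkts", ascending=False).reset_index(drop=True)

# Slice: first two rows, two columns (.loc label slices include the end).
top_two = df.loc[:1, ["Player", "Wkts"]]

print(top_two)
```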