**Resource Information**

In this assignment, you will work with the books.csv file. This file contains detailed information about books scraped from Goodreads. The dataset was downloaded from Kaggle: https://www.kaggle.com/jealousleopard/goodreadsbooks/downloads/goodreadsbooks.zip/6

Each row in the file has ten columns, described below:

- **bookID**: A unique identification number for each book.
- **title**: The name under which the book was published.
- **authors**: Names of the authors of the book. Multiple authors are delimited with -.
- **average_rating**: The average of all ratings the book received.
- **isbn**: Another unique number to identify the book, the International Standard Book Number.
- **isbn13**: A 13-digit ISBN to identify the book, instead of the standard 10-digit ISBN.
- **language_code**: The primary language of the book.
- **num_pages**: Number of pages the book contains.
- **ratings_count**: Total number of ratings the book received.
- **text_reviews_count**: Total number of written text reviews the book received.

**Task**

- Write code that does the following:
  - Use pandas to read the file as a dataframe (named books). The **bookID** column should be the index of the dataframe.
  - Use books.head() to see the first 5 rows of the dataframe.
  - Use books.shape to find the number of rows and columns in the dataframe.
  - Use books.describe() to summarize the data.
  - Use books['authors'].describe() to find the number of unique authors in the dataset and the most frequent author.
  - Use OLS regression to test whether the average rating of a book depends on its number of pages, its number of ratings, and the total number of written text reviews it received.

- Summarize your findings in a Word file.

**Instructions**

#### **Please follow these directions carefully.**

- Please type your code in a Jupyter Notebook file and your summary in a Word document named as follows:

HW6YourFirstNameYourLastName.