Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

YouTube video about SQL Index #5

Open
francesco1119 opened this issue Aug 9, 2020 · 3 comments
Open

YouTube video about SQL Index #5

francesco1119 opened this issue Aug 9, 2020 · 3 comments

Comments

@francesco1119
Copy link

I'm watching this YouTube video, is amazing but there are no instructions about how to generate the random table.

Please give us the instructions and mention them in the video.

Thank you

@davidlpotter59
Copy link

I am also looking for the python code. The last comment was on August 9, 2020 and no response. It is a great video series by the way.

@OssBozier
Copy link

OssBozier commented Jun 22, 2023

I've written an Excel based version. Not pretty but works. You will need to create your own connection to your DB.

GernerateNames.zip

I also thought perhaps I should try writing that python script too, so here it is. This creates an SQL file to import:
GenerateNamesPY.zip

import csv
import random
from datetime import datetime

NUMBER_OF_RECORDS_TO_CREATE = 100000

female_path = "S:\CSV\female_names.csv"
male_path = "S:\CSV\male_names.csv"
last_path = "S:\CSV\last_names.csv"

file = open (female_path, newline='')
female_reader = csv.reader(file)
female_names = [row[1] for row in female_reader]

file = open (male_path, newline='')
male_reader = csv.reader(file)
male_names = [row[1] for row in male_reader]

file = open (last_path, newline='')
last_reader = csv.reader(file)
last_names = [row[1] for row in last_reader]
# Prompt the user for the last ID
while True:
try:
last_id = int(input("Please enter the last person_id in your table [SELECT max(person_id) FROM person;]: "))
break # If the input is a valid integer, break out of the loop
except ValueError:
continue

with open('S:\CSV\persons.sql', 'w') as file:
# Write the initial SQL to create the table
file.write("""
CREATE TABLE IF NOT EXISTS person (
person_id int(11) NOT NULL AUTO_INCREMENT,
first_name varchar(40),
last_name varchar(40),
DOB date,
PRIMARY KEY (person_id)
) ;
""")

# Write the INSERT INTO statement for each person
file.write("INSERT INTO `person` (`person_id`, `first_name`, `last_name`, `DOB`) VALUES\n")

for i in range(last_id + 1, last_id + NUMBER_OF_RECORDS_TO_CREATE+1):
    # Choose a random first name (male or female)
    first_name = random.choice(female_names + male_names)

    # Choose a random last name
    last_name = random.choice(last_names)

    # Choose a random year between 1950 and 2015, and a random month and day
    dob = datetime(random.randint(1950, 2015), random.randint(1, 12), random.randint(1, 28))

    # Write the values into the file
    file.write(f"\t({i}, '{first_name}', '{last_name}', '{dob.strftime('%Y-%m-%d')}'),\n")

# Seek back one character to overwrite the last comma with a semicolon
file.seek(file.tell() - 3, 0)
file.write(";\n")

@OssBozier
Copy link

OssBozier commented Jun 22, 2023

I also translated it to Java:
GenerateNamesJava.zip

If you can't get that to work, here is an SQL script with 2 million random names in it: It's limited by the 25MB file limit unfortunately.

PersonRecords2m.zip

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants