Ecommerce Website Transaction

Description

The "Ecommerce Website Transaction" project is a collaboration between two teams to simulate and analyze ecommerce transaction data. Each team generates a large volume of simulated data from a predefined schema, streams it to the other team, cleans and transforms the data it receives, and then runs analytical queries on it. The results are visualized in Zeppelin and presented to an audience with diverse backgrounds.

Our team's primary objective is to analyze the other team's data and find trends and patterns that yield useful insights. Our simulation also deliberately generates "bad data" by replacing the values in selected columns with unrelated data, which challenges the data cleaning and transformation process.

Click here to see the demo

Schema

The schema used to generate the transaction data includes:

order_id: Order ID
customer_id: Customer ID
customer_name: Customer Name
product_id: Product ID
product_name: Product Name
product_category: Product Category
payment_type: Payment Type
qty: Quantity Ordered
price: Price of Product
datetime: Date & Time when Order was Placed
country: Customer Country
city: Customer City
ecommerce_website_name: Site where Order was Placed
payment_txn_id: Payment Transaction ID
payment_txn_success: Payment Success/Failure
failure_reason: Reason for Payment Failure
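
As a rough illustration, the schema above could be modeled as a Scala case class. This is only a sketch: the field types (for example `Int` for qty and `Option[String]` for failure_reason) and the sample values are assumptions, since only the column names and their meanings are listed here.

```scala
import java.time.LocalDateTime

// Hypothetical record type mirroring the schema columns.
// All Scala types below are assumptions; the schema only names the columns.
final case class Transaction(
  order_id: String,
  customer_id: String,
  customer_name: String,
  product_id: String,
  product_name: String,
  product_category: String,
  payment_type: String,
  qty: Int,
  price: Double,
  datetime: LocalDateTime,
  country: String,
  city: String,
  ecommerce_website_name: String,
  payment_txn_id: String,
  payment_txn_success: Boolean,
  failure_reason: Option[String] // assumed empty when the payment succeeded
)

object Transaction {
  // Column names in schema order, e.g. for writing a CSV header row.
  val columns: Seq[String] = Seq(
    "order_id", "customer_id", "customer_name", "product_id",
    "product_name", "product_category", "payment_type", "qty",
    "price", "datetime", "country", "city",
    "ecommerce_website_name", "payment_txn_id",
    "payment_txn_success", "failure_reason"
  )
}

// Illustrative values only; none of them come from the project's data.
val sample = Transaction(
  "o-1", "c-1", "Alice Example", "p-1", "Widget", "Tools", "Card",
  2, 19.99, LocalDateTime.of(2021, 3, 4, 10, 15),
  "US", "Boston", "example-shop.com", "t-1", true, None
)
```

Using `Option` for failure_reason keeps "no failure" distinct from an empty string, which can make later cleaning rules simpler to express.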

Features

Generates over 2 million rows of transaction data spanning 10 years.
Uses base data on products, companies, and customers stored in files for transaction generation.
Highly customizable data generation.
Generates customers from over 20 different countries with region-accurate names.
Converts transaction prices from USD to the customer's local currency.
Introduces bad data at a rate of 3% for testing and validation.
Simulates logistic growth for each company at different rates.
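
Two of the features above, logistic growth and the 3% bad-data rate, can be sketched in plain Scala. This is a minimal sketch rather than the project's generator: the function names, parameters, junk values, and growth constants are all hypothetical, and only the 3% rate and the 10-year span come from the list above.

```scala
import scala.util.Random

// Standard logistic curve: capacity / (1 + e^(-rate * (t - midpoint))).
// The value is capacity / 2 at t == midpoint and approaches capacity as t grows.
def logisticVolume(t: Double, capacity: Double, rate: Double, midpoint: Double): Double =
  capacity / (1.0 + math.exp(-rate * (t - midpoint)))

// With probability badRate, overwrite one of the target columns
// with an unrelated value; otherwise return the row unchanged.
def maybeCorrupt(
    row: Vector[String],
    targetColumns: Vector[Int],
    junkValues: Vector[String],
    rng: Random,
    badRate: Double = 0.03
): Vector[String] =
  if (rng.nextDouble() < badRate) {
    val column = targetColumns(rng.nextInt(targetColumns.length))
    val junk = junkValues(rng.nextInt(junkValues.length))
    row.updated(column, junk)
  } else row

// Illustrative yearly order volumes for one company over 10 years
// (capacity, rate, and midpoint are made-up numbers).
val yearlyOrders = (0 until 10).map(year => logisticVolume(year, 250000.0, 0.9, 5.0))

// Apply the corruption step to many copies of one illustrative row
// and measure the share of rows that were actually changed.
val rng = new Random(42L)
val cleanRow = Vector("o-1", "Alice", "3", "19.99")
val generated = Vector.fill(100000)(
  maybeCorrupt(cleanRow, Vector(1, 2, 3), Vector("N/A", "-1", "lorem ipsum"), rng)
)
val badShare = generated.count(_ != cleanRow).toDouble / generated.length
```

Giving each company its own `rate` and `midpoint` is one way to make companies grow at different speeds while keeping the same overall curve shape.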

Workflow

Both teams generate transaction data based on the schema.
Data is streamed to the opposite team via Kafka and stored in a CSV file.
Each team cleans and transforms the received data.
Analytical queries are performed on the cleaned data.
Results are visualized using Tableau and Zeppelin.
Teams come together to share findings and present to a mixed audience.
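
The cleaning step in the workflow above might look roughly like the following. It is a sketch under stated assumptions, not the project's code: it uses plain Scala rather than Spark, assumes comma-separated lines with the 16 schema columns in order and no quoted commas, and checks only two fields. The names `clean`, `QtyIdx`, and `PriceIdx` are hypothetical.

```scala
import scala.util.Try

// Assumed layout: 16 columns in schema order, qty at index 7, price at index 8.
val ExpectedColumns = 16
val QtyIdx = 7
val PriceIdx = 8

// Returns Right(fields) for a usable row, or Left(reason) for a rejected one.
def clean(line: String): Either[String, Vector[String]] = {
  // limit -1 keeps trailing empty fields such as an empty failure_reason
  val fields = line.split(",", -1).map(_.trim).toVector
  if (fields.length != ExpectedColumns)
    Left(s"expected $ExpectedColumns columns, got ${fields.length}")
  else if (!Try(fields(QtyIdx).toInt).toOption.exists(_ > 0))
    Left("qty is not a positive integer")
  else if (!Try(fields(PriceIdx).toDouble).toOption.exists(_ > 0))
    Left("price is not a positive number")
  else
    Right(fields)
}

// Illustrative input lines; the values are made up.
val goodLine =
  "o-1,c-1,Alice,p-1,Widget,Tools,Card,2,19.99,2021-03-04 10:15:00,US,Boston,example-shop.com,t-1,Y,"
val badQtyLine =
  "o-2,c-2,Bob,p-2,Gadget,Tools,Card,abc,5.00,2021-03-05 11:00:00,US,Austin,example-shop.com,t-2,Y,"
val shortLine = "o-3,c-3,Carol"

// Split the input into rejected and accepted rows.
val (rejected, accepted) = Seq(goodLine, badQtyLine, shortLine).map(clean).partition(_.isLeft)
```

Returning the rejection reason instead of silently dropping rows makes it possible to report how much bad data was received and why it was discarded.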

Technologies

Apache Spark
Spark SQL
Kafka
Scala 2.12.11
Zeppelin

Contributors

A big thank you to all our contributors who made this project possible.

NewyorkMengHer/Ecommerce-Website-Transaction
