Merge pull request #1 from creditdatamw/feature/sql-export
Adds export to SQL INSERT statements functionality
zikani03 committed Sep 3, 2020
2 parents f959535 + d2a5155 commit 7144a09
Showing 10 changed files with 348 additions and 115 deletions.
4 changes: 3 additions & 1 deletion .gitignore
@@ -3,4 +3,6 @@
/.idea/

# Ignore Gradle build output directory
build
build
*.xlsx
/*.sql
33 changes: 24 additions & 9 deletions README.md
@@ -1,16 +1,16 @@
Zé Faker
========

`zefaker` is a command-line tool that allows you to generate Excel files
`zefaker` is a command-line tool that allows you to generate Excel and SQL files
using a simple Groovy DSL and [java-faker](https://github.com/DiUS/java-faker)

## Why would I use this?

Well, if you ever need to generate an Excel file with (random*) data for whatever
reason you can use `zefaker` to automate the process while leveraging the power of
[Groovy](https://www.groovy-lang.org)!
Well, if you ever need to generate an Excel file or SQL insert statements
with (random*) data for whatever reason you can use `zefaker` to automate the
process while leveraging the power of [Groovy](https://www.groovy-lang.org)!

We created it because we deal with a lot of Excel files (with lots of columns!)
We created it because we deal with a lot of Excel files and SQL (with lots of columns!)
and often have to generate files to test the code that processes those files.

_* the generated data need not necessarily be random_
@@ -42,12 +42,24 @@ generateFrom columns

Once you have this, you can pass it to the `zefaker` command to generate an Excel file:

**Exporting to Excel**

```sh
$ java -jar zefaker.jar -f=person.groovy -sheet="Persons" -rows=100 -output=people.xlsx
```

The example command above generates a file named **people.xlsx** with **100 rows** populated
with data generated using the faker methods specified in the groovy script.
with data generated using the Faker functions specified in the Groovy script.

**Exporting to SQL INSERTS**

```sh
$ java -jar zefaker.jar -f=person.groovy -sql -table="people" -rows=100 -output=people-data.sql
```

The example command above generates a file named **people-data.sql** with
**100 INSERT statements** whose _VALUES_ clauses contain random data
generated using the Faker functions specified in the Groovy script.

_Bonus / Shameless plug_: If you're using Java, you can process the generated files _quickly_ and
_efficiently_ using [zerocell](https://github.com/creditdatamw/zerocell).
@@ -63,11 +75,13 @@ Download a copy of `zefaker` from the [Releases](https://github.com/creditdatamw
### Command Line

```sh
Usage: zefaker [-x] [-vvv] -f=FILE -output=FILE [-rows=ROWS] [-sheet=SHEET]
Usage: zefaker [options]
-f=FILE Groovy file with column definitions
-output=FILE File to write to, e.g. generated.xlsx
-rows=ROWS Number of rows to generate
-sheet=SHEET Sheet name in the generated Excel file
-output=FILE File to write to, e.g. generated.xlsx
-sheet=NAME Sheet name in the generated Excel file
-table=NAME The name of the table to use in SQL INSERT mode
-sql Use SQL INSERT export mode
-vvv Show verbose output
-x Overwrite existing file
```
@@ -103,6 +117,7 @@ specified on the command-line.
The following special variables are available, and are therefore *reserved names*:

* **sheetName** - Change the name of the target Sheet in Excel. Overrides `-sheet`
* **tableName** - Change the name of the target table in SQL INSERTS. Overrides `-table`
* **outputFile** - The name/path of the file to write output to. Overrides `-output`
* **verbose** - Show verbose output. Overrides `-vvv`
* **maxRows** - Sets the maximum number of rows to generate in the file. Overrides `-rows`
15 changes: 10 additions & 5 deletions example.groovy
@@ -1,11 +1,16 @@
firstName = column(index= 0, name= "Firstname")
lastName = column(index= 1, name= "Last Name")
age = column(index= 2, name= "Age")
// Uncomment one of the quoteIdentifiersAs lines below to add column quoting for SQL exports
// quoteIdentifiersAs("mysql")
// quoteIdentifiersAs("postgres")
// quoteIdentifiersAs("mssql")

accountStatus = column(index=3, name="Account Status")
firstName = column(index= 0, name= "first_name")
lastName = column(index= 1, name= "last_name")
age = column(index= 2, name= "age")

accountStatus = column(index=3, name="account_status")

columns = [
(firstName): { faker -> faker.name().firstName() },
(firstName): { faker -> faker.name().firstName() },
(lastName): { faker -> faker.name().lastName() },
(age): { faker -> faker.number().numberBetween(18, 70) },
(accountStatus): { faker -> faker.options().option("Open", "Closed") }
2 changes: 1 addition & 1 deletion examples/multi_configuration.groovy
@@ -15,7 +15,7 @@ generateFrom([
(column(index= 0, name="Account No.")): { faker -> faker.number().numberBetween(1, 200) },
(column(index= 1, name="Company Name")): { faker -> faker.name.fullName() + faker.options().option("Plc", "Pvt Ltd", "") },
(column(index= 2, name="TPIN")): { faker -> "TPIN" + faker.number().numberBetween(1, 200) },
(column(index= 3, name="Registration Date")): { faker -> "197001-01" },
(column(index= 3, name="Registration Date")): { faker -> "1970-01-01" },
(column(index= 4, name="Postal Address")): { faker -> "P.O. Box 123" },
(column(index= 5, name="Telephone")): { faker -> "265999" + faker.number().numberBetween(111111, 999999) }
])
13 changes: 13 additions & 0 deletions src/main/groovy/zefaker/ColumnDef.groovy
@@ -0,0 +1,13 @@
package zefaker

class ColumnDef {
    int index
    String name
    Closure faker

    public ColumnDef(int index, String name, Closure faker) {
        this.index = index
        this.name = name
        this.faker = faker
    }
}
8 changes: 8 additions & 0 deletions src/main/groovy/zefaker/ColumnQuotes.groovy
@@ -0,0 +1,8 @@
package zefaker

enum ColumnQuotes {
    NONE,
    MSSQL,
    MYSQL,
    POSTGRESQL
}
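The enum above only names the quoting styles; the delimiters themselves are applied later in `SqlFileGenerator`. As a rough illustration, the same mapping can be sketched in plain Java by attaching the delimiters to each constant. `QuotingSketch`, `QuoteStyle` and `columnList` are hypothetical names for this sketch, not part of the commit, and the sketch does not escape delimiter characters that appear inside an identifier.

```java
import java.util.List;
import java.util.stream.Collectors;

class QuotingSketch {
    // Hypothetical stand-in for ColumnQuotes, with the delimiters stored on each constant.
    enum QuoteStyle {
        NONE("", ""),
        MSSQL("[", "]"),
        MYSQL("`", "`"),
        POSTGRESQL("\"", "\"");

        private final String open;
        private final String close;

        QuoteStyle(String open, String close) {
            this.open = open;
            this.close = close;
        }

        // Wraps the identifier in this style's delimiters (no escaping of embedded delimiters).
        String quote(String identifier) {
            return open + identifier + close;
        }
    }

    // Builds the comma-separated column list that goes between the parentheses of an INSERT.
    static String columnList(QuoteStyle style, List<String> names) {
        return names.stream().map(style::quote).collect(Collectors.joining(","));
    }

    public static void main(String[] args) {
        System.out.println(columnList(QuoteStyle.MYSQL, List.of("first_name", "age")));
        System.out.println(columnList(QuoteStyle.POSTGRESQL, List.of("first_name", "age")));
    }
}
```

Keeping the delimiters on the enum avoids a `switch` at the call site, at the cost of putting formatting knowledge into the enum; either arrangement produces the same column list.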
105 changes: 105 additions & 0 deletions src/main/groovy/zefaker/ExcelFileGenerator.groovy
@@ -0,0 +1,105 @@
package zefaker

import com.github.javafaker.Faker
import org.apache.poi.ss.usermodel.Cell
import org.apache.poi.ss.usermodel.Row
import org.apache.poi.ss.usermodel.Sheet
import org.apache.poi.ss.usermodel.Workbook
import org.apache.poi.ss.util.WorkbookUtil
import org.apache.poi.xssf.usermodel.XSSFWorkbook
import org.apache.poi.xssf.streaming.SXSSFWorkbook

import java.nio.file.Paths
import java.nio.file.Files
import java.util.stream.Collectors
import java.util.concurrent.CountDownLatch
import java.util.concurrent.atomic.AtomicLong

class ExcelFileGenerator implements Runnable {
    Workbook wb
    def columnDefs
    def filePath
    def maxRows
    def sheetName = "Sheet 1"
    final CountDownLatch latch
    final Faker faker
    final AtomicLong generated = new AtomicLong(0)

    ExcelFileGenerator(faker, filePath, columnDefs, sheetName, streamingBatchSize, maxRows, latch) {
        this.faker = faker
        this.filePath = filePath
        this.sheetName = sheetName
        // Streaming workbook: keeps only streamingBatchSize rows in memory at a time
        this.wb = new SXSSFWorkbook(streamingBatchSize)
        this.columnDefs = columnDefs
        this.latch = latch
        this.maxRows = maxRows
    }

    void run() {
        try {
            def fos = Files.newOutputStream(filePath)
            def sheet = this.wb.createSheet(WorkbookUtil.createSafeSheetName(sheetName))
            def dateCellStyle = this.wb.createCellStyle()

            dateCellStyle.setDataFormat(
                // TODO: Enable user to specify a date format in the script
                this.wb.getCreationHelper().createDataFormat().getFormat("yyyy-mm-dd")
            )

            // Create file headers
            def row = sheet.createRow(0)

            columnDefs.keySet().each {
                def cell = row.createCell(it.index)
                cell.setCellValue(it.name)
                // TODO: if(s.contains("DATE")) cell.setCellStyle(dateCellStyle);
            }

            try {
                populateSheet(sheet, columnDefs)
            } catch (Exception e) {
                System.err.println("ERROR: Exception during file processing: " + e.getMessage())
            } finally {
                // Write whatever was generated, even if populating the sheet failed part-way
                wb.write(fos)
                fos.close()

                wb.dispose() // remove temporary files
                wb.close()
                sheet = null
            }
        } catch (IOException e) {
            throw new RuntimeException("Failed to generate file", e)
        } finally {
            // Always release the latch so the waiting thread is not blocked
            // when an unexpected (non-IO) exception is thrown
            latch.countDown()
        }
    }

    /**
     * Populate the Sheet using the given column definitions
     * @param sheet The sheet to write to
     * @param columnDefs map of column definitions
     */
    void populateSheet(sheet, columnDefs) {
        int nextRow = 1 // row 0 holds the headers
        Row row = null

        while (generated.get() < maxRows) {
            row = sheet.createRow(nextRow)

            columnDefs.each {
                def col = it.getKey()
                def fakerFunc = it.getValue()
                def generatedValue = fakerFunc(faker)
                row.createCell(col.index).setCellValue(generatedValue)
            }

            nextRow++
            generated.incrementAndGet()
        }
    }
}
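Both generators in this commit signal completion through a `CountDownLatch`. The code that waits on the latch is not part of this diff, so the following is only a minimal Java sketch of the hand-off under that assumption: the worker counts down in a `finally` block, so the waiting thread is released whether the work succeeds or throws. `LatchSketch`, `guarded` and `runAndWait` are hypothetical names.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

class LatchSketch {
    // Wraps a task so the latch is released on success and on failure alike.
    static Runnable guarded(Runnable work, CountDownLatch latch) {
        return () -> {
            try {
                work.run();
            } finally {
                latch.countDown();
            }
        };
    }

    // Runs the task on its own thread and reports whether the latch opened before the timeout.
    static boolean runAndWait(Runnable work, long timeoutMillis) {
        CountDownLatch latch = new CountDownLatch(1);
        Thread thread = new Thread(guarded(work, latch));
        // The worker's exception is ignored here only to keep the demo output quiet.
        thread.setUncaughtExceptionHandler((t, e) -> { });
        thread.start();
        try {
            return latch.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(runAndWait(() -> { }, 1000));
        System.out.println(runAndWait(() -> { throw new IllegalStateException("boom"); }, 1000));
    }
}
```

A real caller would normally log or rethrow the worker's failure rather than discard it; the sketch only demonstrates that the latch opens in both cases.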
8 changes: 7 additions & 1 deletion src/main/groovy/zefaker/Main.groovy
@@ -12,6 +12,8 @@ cli.x(type: Boolean, defaultValue: 'false', 'Overwrite existing file')
cli.output(type: String, required: true, 'File to write to, e.g. generated.xlsx')
cli.sheet(type: String, defaultValue: 'Data', 'Sheet name in the generated Excel file')
cli.rows(type: Integer, defaultValue: '10', 'Number of rows to generate')
cli.table(type: String, defaultValue: 'Data', 'Table name in the generated SQL file')
cli.sql(type: Boolean, defaultValue: 'false', 'Export as SQL INSERTS instead of Excel')
cli.vvv(type: Boolean, defaultValue: 'false', 'Show verbose output')

def options = cli.parse(args)
@@ -27,9 +29,13 @@ def groovyShell = new GroovyShell(this.class.classLoader, binding, config)

binding.setProperty("faker", new Faker())
binding.setProperty("verbose", options.vvv)
binding.setProperty("sheetName", options.sheet)
binding.setProperty("maxRows", options.rows)
binding.setProperty("outputFile", options.output)
binding.setProperty("overwriteExisting", options.x)
// Options for the Excel output
binding.setProperty("sheetName", options.sheet)
// Options for the SQL output
binding.setProperty("tableName", options.table)
binding.setProperty("exportAsSql", options.sql)

groovyShell.evaluate(options.f)
115 changes: 115 additions & 0 deletions src/main/groovy/zefaker/SqlFileGenerator.groovy
@@ -0,0 +1,115 @@
package zefaker

import com.github.javafaker.Faker

import java.nio.file.Paths
import java.nio.file.Files
import java.io.BufferedWriter
import java.util.stream.Collectors
import java.util.concurrent.CountDownLatch
import java.util.concurrent.atomic.AtomicLong

class SqlFileGenerator implements Runnable {
    def columnDefs
    def filePath
    def tableName = "data"
    def maxRows = 10
    def quoteMode = ColumnQuotes.NONE
    final CountDownLatch latch
    final Faker faker
    final AtomicLong generated = new AtomicLong(0)

    final VALUES_PLACEHOLDER = "__values__"

    SqlFileGenerator(faker, filePath, columnDefs, tableName, maxRows, latch) {
        this.faker = faker
        this.filePath = filePath
        this.columnDefs = columnDefs
        this.tableName = tableName
        this.latch = latch
        this.maxRows = maxRows
    }

    void setQuoteMode(quoteMode) {
        this.quoteMode = quoteMode
    }

    void run() {
        // Build the statement once, leaving a placeholder where each row's values go
        StringBuilder sb = new StringBuilder()
        sb.append("INSERT INTO ")
            .append(tableName)
            .append(" (")
            // TODO: consider order of the columns?
            .append(columnDefs.keySet()
                .stream()
                .map({ it ->
                    switch (quoteMode) {
                        case ColumnQuotes.MSSQL:
                            return String.format("[%s]", it.name)
                        case ColumnQuotes.MYSQL:
                            return String.format("`%s`", it.name)
                        case ColumnQuotes.POSTGRESQL:
                            return String.format("\"%s\"", it.name)
                        case ColumnQuotes.NONE:
                        default:
                            return it.name
                    }
                })
                .collect(Collectors.joining(",")))
            .append(") ")
            .append("VALUES (")
            .append(VALUES_PLACEHOLDER)
            .append(");")

        def sqlInsertTemplate = sb.toString()

        def bufferedWriter = Files.newBufferedWriter(filePath)

        try {
            def rowValues = new Object[columnDefs.size()]

            while (generated.get() < maxRows) {
                columnDefs.each {
                    def col = it.getKey()
                    def fakerFunc = it.getValue()
                    def generatedValue = fakerFunc(faker)
                    rowValues[col.index] = String.valueOf(generatedValue)
                }

                def sqlStatement = createInsertStatement(sqlInsertTemplate, rowValues)
                bufferedWriter.write(sqlStatement)
                bufferedWriter.newLine()

                generated.incrementAndGet()
            }

            bufferedWriter.flush()
        } catch (Exception e) {
            throw new RuntimeException("Failed to generate file", e)
        } finally {
            // Close the writer on success as well as on failure, then release the latch
            try {
                bufferedWriter.close()
            } finally {
                latch.countDown()
            }
        }
    }

    String createInsertStatement(sqlTemplate, rowValues) {
        // Render every value as a single-quoted SQL string literal
        def valuesString = Arrays.stream(rowValues)
            .map({ it ->
                if (it == null) return it
                if (it instanceof String) {
                    // Double embedded single quotes (SQL-standard escaping)
                    return String.format("'%s'", it.replace("'", "''"))
                }
                return String.format("'%s'", it)
            })
            .collect(Collectors.joining(","))

        return sqlTemplate.replace(VALUES_PLACEHOLDER, valuesString)
    }
}
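As a point of comparison for the value rendering in `createInsertStatement`, here is a small self-contained Java sketch of one alternative policy: numbers are written unquoted, `null` becomes the `NULL` keyword, and everything else becomes a string literal with embedded single quotes doubled. This is an assumption-laden illustration rather than what the commit does; `InsertSketch`, `literal` and `insert` are hypothetical names, and the column list is passed in already formatted.

```java
import java.util.Arrays;
import java.util.stream.Collectors;

class InsertSketch {
    // Renders one value as a SQL literal.
    static String literal(Object value) {
        if (value == null) {
            return "NULL";
        }
        if (value instanceof Number) {
            return value.toString();
        }
        // Doubling the single quote is the SQL-standard way to escape it inside a string literal.
        return "'" + value.toString().replace("'", "''") + "'";
    }

    // Assembles a single INSERT statement from a table name, a ready-made column list and row values.
    static String insert(String table, String columnList, Object... values) {
        String valueList = Arrays.stream(values)
                .map(InsertSketch::literal)
                .collect(Collectors.joining(","));
        return "INSERT INTO " + table + " (" + columnList + ") VALUES (" + valueList + ");";
    }

    public static void main(String[] args) {
        System.out.println(insert("people", "first_name,age", "O'Brien", 42));
    }
}
```

Generated files like these are meant for test data; for values that come from untrusted input, parameterized statements are the safer route than string-built SQL.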