
FakeDataTelegramBot: Generate Fake Data (JSON / SQL / CSV)

Telegram bot that interactively generates fake datasets and sends them back as a file.
Powered by Java 20, Maven, java-telegram-bot-api, Lombok, and JavaFaker.

Note: The current implementation supports JSON, SQL, and CSV. (“C” structs are not implemented in this codebase.)


✨ Features

  • Chat-based wizard using inline keyboards:
    • /generate → File name → Format (json/sql/csv) → Row count → Fields → ✔️ Generate.
  • Built-in field library (via FieldType enum) such as FULL_NAME, FIRST_NAME, AGE, CITY, BOOK_TITLE, CELL_PHONE, GENDER, etc.
  • Formats:
    • JSON: array of objects
    • SQL: INSERT INTO <file_name_format> statements
    • CSV: header + rows, commas safely quoted
  • Sends the generated file to the chat and deletes the temp file on disk.

🧱 Tech Stack

  • Java 20, Maven project (groupId=org.example, artifactId=FakeDataTelegramBot).
  • Dependencies:
    • com.github.pengrad:java-telegram-bot-api:6.8.0
    • org.projectlombok:lombok:1.18.28 (scope provided)
    • com.github.javafaker:javafaker:1.0.2

📦 Project Structure & Key Classes

  • TelegramBotRunner: entry point that starts the bot, registers UpdatesListener, and dispatches updates on a thread pool. Reads bot.token from settings bundle.
  • TelegramBotUpdateHandler: main conversation handler implementing the step-by-step wizard, inline keyboards, file sending, and temp file cleanup.
  • FakeDataGenerator: the core generator. It:
    • Maps each FieldType → Supplier<Object> using Faker
    • Provides writers for json, sql, and csv
    • Returns a Path to the generated file
  • FieldType: enum of supported fields + helpers for quoting/formatting JSON & SQL values.
  • Pairs: (fieldName, fieldType) pair chosen by the user.
  • Request: generation request (fileName, count, type, pairs).
  • Main: a tiny demo that instantiates FakeDataGenerator (not used in production bot flow).
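
The FieldType → Supplier<Object> mapping described for FakeDataGenerator could look roughly like the sketch below. It is illustrative only: SketchFieldType and SupplierMapSketch are hypothetical names, only three of the listed fields are covered, and java.util.Random stands in for JavaFaker so the snippet runs without dependencies.

```java
import java.util.EnumMap;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Random;
import java.util.function.Supplier;

// Hypothetical, trimmed-down stand-in for the project's FieldType enum.
enum SketchFieldType { AGE, ID, GENDER }

class SupplierMapSketch {
    private final Map<SketchFieldType, Supplier<Object>> suppliers =
            new EnumMap<>(SketchFieldType.class);

    SupplierMapSketch(Random random) {
        // Ranges follow the Supported Fields list: AGE 1-100, ID 10000-99999.
        // Assumption: the real generator would call Faker here instead of Random.
        suppliers.put(SketchFieldType.AGE, () -> 1 + random.nextInt(100));
        suppliers.put(SketchFieldType.ID, () -> 10_000 + random.nextInt(90_000));
        suppliers.put(SketchFieldType.GENDER, () -> random.nextBoolean() ? "Male" : "Female");
    }

    // Builds one row: user-chosen column name -> freshly generated value.
    Map<String, Object> row(Map<String, SketchFieldType> columns) {
        Map<String, Object> row = new LinkedHashMap<>();
        columns.forEach((name, type) -> row.put(name, suppliers.get(type).get()));
        return row;
    }
}
```

Keeping the suppliers in an EnumMap means that adding a field is one enum constant plus one put call, and the writers never need to know how a value was produced.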

🔧 Setup & Run

  1. Configure token
    Create a settings.properties on the classpath (e.g., in src/main/resources):
        bot.token=YOUR_TELEGRAM_BOT_TOKEN
  2. Build
        mvn clean package
  3. Run
        java -cp target/FakeDataTelegramBot-1.0-SNAPSHOT.jar org.example.TelegramBotRunner
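
As a rough illustration of the token lookup, the sketch below reads bot.token from a ResourceBundle and fails fast when the key is missing or blank. TokenConfigSketch, readToken, and fromText are hypothetical names, and the validation is an assumption; the project's TelegramBotRunner may read the bundle differently.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.PropertyResourceBundle;
import java.util.ResourceBundle;

class TokenConfigSketch {
    // Returns the trimmed token, or throws if it is absent or empty.
    static String readToken(ResourceBundle bundle) {
        if (!bundle.containsKey("bot.token")) {
            throw new IllegalStateException("bot.token is missing from settings.properties");
        }
        String token = bundle.getString("bot.token").trim();
        if (token.isEmpty()) {
            throw new IllegalStateException("bot.token is empty");
        }
        return token;
    }

    // Helper for trying the sketch without a file on the classpath.
    static ResourceBundle fromText(String propertiesText) {
        try {
            return new PropertyResourceBundle(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With settings.properties on the classpath, the call would be TokenConfigSketch.readToken(ResourceBundle.getBundle("settings")).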

💬 Usage (in Telegram)

  1. Start a chat with your bot and send:
        /start
  2. Begin generation:
        /generate
  3. Choose fields via inline keyboard
    Tap any number of fields (from FieldType enum).
    Hit ✔️ Generate to create and receive the file.
  4. The bot sends the file and cleans up the temp file.


🗂 Supported Fields (FieldType)

FieldType        Example / Notes
FULL_NAME        "Aisha Karimova"
FIRST_NAME       "Bek"
LAST_NAME        "Yusupov"
USERNAME         "Clara" (uses firstName)
TITLE            job title
BLOOD_GROUP      from name.bloodGroup()
WORDS            faker lorem words
CHARACTERS       single/random character
CITY             country capital
COUNTRY          country name
COUNTRY_CODE     ISO-2 country code
ZIP_CODE         postal code
BOOK_AUTHOR      author
BOOK_GENRE       genre
BOOK_PUBLISHER   publisher
BOOK_TITLE       title
CELL_PHONE       phone number
AGE              int 1–100
ID               int 10000–99999
GENDER           "Male" / "Female"

📄 Output Formats

JSON

  • Array of objects with selected fields.
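
A minimal sketch of how an array of objects might be serialized by hand. JsonSketch is a hypothetical class, not the project's writer. It is deliberately simplified: it escapes only backslashes and double quotes, so control characters in values are not handled.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class JsonSketch {
    // Numbers are emitted bare; everything else becomes a quoted string.
    static String value(Object v) {
        if (v instanceof Number) {
            return v.toString();
        }
        String escaped = String.valueOf(v).replace("\\", "\\\\").replace("\"", "\\\"");
        return "\"" + escaped + "\"";
    }

    // One row -> {"field": value, ...}, preserving the map's iteration order.
    static String object(Map<String, Object> row) {
        return row.entrySet().stream()
                .map(e -> value(e.getKey()) + ": " + value(e.getValue()))
                .collect(Collectors.joining(", ", "{", "}"));
    }

    // All rows -> a JSON array with one object per line.
    static String array(List<Map<String, Object>> rows) {
        return rows.stream()
                .map(JsonSketch::object)
                .collect(Collectors.joining(",\n  ", "[\n  ", "\n]"));
    }
}
```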

SQL

  • Multiple INSERT INTO <file_name_format> (...) VALUES (...);
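
For illustration, one INSERT statement per row could be assembled as below. SqlSketch is a hypothetical class, and the table name is simply passed in as a parameter. The quoting rule (single quotes doubled, numbers left bare) is an assumption about reasonable behavior, not a description of the project's FieldType helpers.

```java
import java.util.Map;
import java.util.stream.Collectors;

class SqlSketch {
    // Numbers stay bare; other values become single-quoted literals with ' doubled.
    static String literal(Object v) {
        if (v instanceof Number) {
            return v.toString();
        }
        return "'" + String.valueOf(v).replace("'", "''") + "'";
    }

    // Builds: INSERT INTO <table> (col1, col2) VALUES (v1, v2);
    static String insert(String table, Map<String, Object> row) {
        String columns = String.join(", ", row.keySet());
        String values = row.values().stream()
                .map(SqlSketch::literal)
                .collect(Collectors.joining(", "));
        return "INSERT INTO " + table + " (" + columns + ") VALUES (" + values + ");";
    }
}
```

Column names are written as given, so this sketch assumes they are already valid SQL identifiers.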

CSV

  • Header = selected field names; values quoted when they contain commas.
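
The quoting rule can be sketched as follows. CsvSketch is a hypothetical class. The README only mentions commas; also quoting values that contain double quotes or line breaks, and doubling embedded quotes, is an assumption borrowed from common CSV practice.

```java
import java.util.Collection;
import java.util.List;
import java.util.stream.Collectors;

class CsvSketch {
    // Wraps a value in quotes only when it would otherwise break the row.
    static String field(Object v) {
        String s = String.valueOf(v);
        boolean needsQuotes = s.contains(",") || s.contains("\"")
                || s.contains("\n") || s.contains("\r");
        return needsQuotes ? "\"" + s.replace("\"", "\"\"") + "\"" : s;
    }

    // Joins one row's values with commas.
    static String line(Collection<?> values) {
        return values.stream().map(CsvSketch::field).collect(Collectors.joining(","));
    }

    // Header line first, then one line per row.
    static String document(List<String> header, List<? extends Collection<?>> rows) {
        StringBuilder out = new StringBuilder(line(header)).append('\n');
        for (Collection<?> row : rows) {
            out.append(line(row)).append('\n');
        }
        return out.toString();
    }
}
```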

🧭 Conversation State Machine

State enum drives the wizard:

  • FILE_NAME → TYPE → ROW_COUNT → FIELDS
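
One way to picture this progression is an enum with a next() step plus a per-chat map, as in the sketch below. WizardState and WizardSketch are hypothetical names, and keying the state by chat ID is an assumption; the project's State enum and handler may be organized differently.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical mirror of the wizard steps, in the order listed above.
enum WizardState {
    FILE_NAME, TYPE, ROW_COUNT, FIELDS;

    // Moves to the following step; FIELDS is terminal and stays put.
    WizardState next() {
        return this == FIELDS ? FIELDS : values()[ordinal() + 1];
    }
}

class WizardSketch {
    // Assumption: one independent wizard per chat, keyed by chat ID.
    private final Map<Long, WizardState> stateByChat = new ConcurrentHashMap<>();

    // Called when the user sends /generate.
    WizardState start(long chatId) {
        stateByChat.put(chatId, WizardState.FILE_NAME);
        return WizardState.FILE_NAME;
    }

    // Called after a valid answer for the current step.
    WizardState advance(long chatId) {
        return stateByChat.compute(chatId,
                (id, state) -> state == null ? WizardState.FILE_NAME : state.next());
    }

    WizardState current(long chatId) {
        return stateByChat.get(chatId);
    }
}
```

A ConcurrentHashMap is used because updates from different chats may be processed on different pool threads.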

🧩 Implementation Notes

  • Keyboard: Inline keyboard built from FieldType.values() (two columns), plus a final row “✔️ Generate”.
  • Concurrency: Updates are handled with CompletableFuture.runAsync on a fixed thread pool.
  • Cleanup: After sending SendDocument, the generated file is deleted via Files.delete(...).
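
The three notes above can be sketched with the standard library alone. HandlerSketch and its methods are hypothetical, and a Consumer<Path> stands in for the Telegram SendDocument call, so this shows only the shape of the logic (two-column rows, runAsync on a fixed pool, delete after sending) and not the bot's actual code.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.function.Consumer;

class HandlerSketch {
    // Splits button labels into rows of two and appends a final single-button row.
    static List<List<String>> keyboardRows(List<String> labels, String finalLabel) {
        List<List<String>> rows = new ArrayList<>();
        for (int i = 0; i < labels.size(); i += 2) {
            rows.add(List.copyOf(labels.subList(i, Math.min(i + 2, labels.size()))));
        }
        rows.add(List.of(finalLabel));
        return rows;
    }

    // Writes the content to a temp file, hands it to the sender, then deletes it.
    // The finally block removes the file even when the sender throws.
    static CompletableFuture<Void> handleAsync(ExecutorService pool, String content,
                                               Consumer<Path> sender) {
        return CompletableFuture.runAsync(() -> {
            try {
                Path file = Files.createTempFile("fake-data-", ".csv");
                try {
                    Files.writeString(file, content);
                    sender.accept(file); // stand-in for sending the document to the chat
                } finally {
                    Files.delete(file);
                }
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }, pool);
    }
}
```

A caller would create the pool once, for example with Executors.newFixedThreadPool(4), and reuse it for every update; the pool size here is an arbitrary illustration.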

🚦 Limitations & To-Dos

  • C structs output is not implemented (formats are json, sql, csv in code).
  • No per-field constraints (e.g., ranges for AGE) beyond defaults.
  • No schema persistence across sessions.
  • Minimal error messaging for invalid inputs.

Ideas to extend:

  • Add a C formatter (struct + initializer).
  • Add CREATE TABLE for SQL with type inference.
  • Add locales and seed configuration commands.
  • Persist last-used field sets per user.


🛡️ Safety

  • Generated data is synthetic.
  • Large row counts produce large files.
  • Bot token is read from settings.properties; keep it secret.

📜 License

Add your preferred license (e.g., MIT) to the repository.


👩‍💻 Development Quick Reference

  • Java 20 / Maven compiler configuration is set in the POM.
  • Libraries:
    • java-telegram-bot-api for Telegram integration
    • javafaker for data generation
    • lombok for boilerplate reduction