Create_sample function #8

Open
luminousmen opened this issue Sep 24, 2023 · 1 comment

Comments

@luminousmen
Owner

luminousmen commented Sep 24, 2023

We need a create_sample function that generates synthetic sample data for testing, experimentation, and demonstration. The generated data should closely mimic real data.

AC:

  • Create a create_sample function for both the Avro and Parquet util classes that generates synthetic data from a specified data schema or structure. Proposed interface:
def create_sample(schema_path: Path, n: int):
    ...
  • Implement options to specify the number of records or rows to generate in the sample data.
  • Ensure that the generated sample data adheres to the provided data schema or structure.
  • Provide flexibility to generate both structured and semi-structured data, including support for nested structures if applicable.
  • The function should work with the existing sample data files in the tests/data directory.
  • Write unit and integration tests to validate the correctness of the function.
  • Integrate the function into the interface.
  • Ensure the function is easy to use with a clear and well-documented API.
@Mukku27

Mukku27 commented Oct 17, 2024

@luminousmen I would like to work on this. Please assign it to me.
