Data Modeling Related
Responsibilities:
- Design and develop conceptual, logical, and physical data models for structured and unstructured data, covering batch and real-time data processing for AI/GenAI projects.
- Evaluate the performance of data systems and implement data strategies.
- Identify, track, and resolve data-related issues and malfunctions.
- Analyze and evaluate data systems and models for efficiency, optimization, and quality.
- Develop best practices for coding standards and naming conventions, establish data modeling standards, and ensure data models are consistent.
- Evaluate databases and data models for inconsistencies and variances, ensuring data is represented correctly.
- Present optimization and standardization recommendations for the organization's data systems.
- Reverse-engineer physical data models.
- Work with Data Engineers, Backend Developers, and Enterprise Architects to define end-to-end integration between data, GenAI models, and other components in the project.
- Collaborate with cross-functional teams to integrate ML solutions into new and existing systems and applications.
- Participate in developing and maintaining GenAI models.
- Collaborate with MLOps to define model selection and evaluation criteria.
- Data preparation and pre-processing:
  - Data collection
  - Analysis of the content and formatting of data sources, both structured and unstructured (e.g. SOPs or memos in PDF/Word, voice, image, video)
  - Data cleansing and standardization
  - Data formatting and augmentation
  - Data optimization
- Data Model Design
- Data Processing (e.g. writing ETL and stream-processing jobs)
- Participate in GenAI model selection.
- Participate in defining the optimal RAG-based system workflow, e.g. parsing, chunking, embedding, indexing (into a vector DB), prompting, retrieval, augmentation, generation, and evaluation.
- Participate in fine-tuning, if necessary.
- Participate in ensuring the system is equipped with the necessary guardrails and safety measures.
- Participate in defining model evaluation criteria (e.g. accuracy, fluency, relevance, bias, coherence).
- Conduct rigorous testing to ensure the system's correctness, readability, performance, and reliability. Evaluate it against predefined criteria and objectives to ensure it meets the required standards and business needs.
- Create an evaluation matrix.
- Create visualizations of model performance.
- Create model performance monitoring together with the MLOps team.
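
For illustration only, the RAG workflow stages named in the responsibilities above (chunking, embedding, indexing, retrieval, augmentation) might be sketched as follows. This is a toy, standard-library Python sketch and not the project's actual stack: every function name is hypothetical, the bag-of-words `embed` stands in for a real embedding model, the in-memory list stands in for a vector DB, and the generation and evaluation steps are left out.

```python
import math
from collections import Counter


def chunk(text: str, size: int = 8) -> list[str]:
    """Chunking: split already-parsed text into windows of `size` words."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def embed(text: str) -> Counter:
    """Embedding stand-in (assumption): a lowercase bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_index(chunks: list[str]) -> list[tuple[str, Counter]]:
    """Indexing: store each chunk next to its vector (vector DB stand-in)."""
    return [(c, embed(c)) for c in chunks]


def retrieve(index: list[tuple[str, Counter]], query: str, k: int = 1) -> list[str]:
    """Retrieval: return the k chunks most similar to the query."""
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]


def build_prompt(query: str, context: list[str]) -> str:
    """Augmentation: place the retrieved chunks in front of the question."""
    joined = "\n".join(context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {query}"


# Hypothetical sample text, used only to exercise the pipeline.
document = (
    "Refunds are processed within five business days. "
    "Support tickets are answered within one business day."
)
index = build_index(chunk(document, size=7))
question = "How fast are refunds processed"
prompt = build_prompt(question, retrieve(index, question, k=1))
```

The resulting `prompt` string is what would be passed to a GenAI model in the generation step. Keeping each stage as a separate function makes it easier to swap in a real parser, embedding model, or vector store and to evaluate each stage on its own.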
Minimum Qualifications:
- Experience in data modeling, database design, and machine learning frameworks.
- Proficiency in programming languages such as Python and SQL, and experience with ML libraries such as TensorFlow or PyTorch.
- Familiarity with cloud platforms (AWS, Google Cloud) and their Data and ML-related services.
- Strong problem-solving skills and ability to work in a collaborative environment.