
Data Scientist (2+ years with NLP experience only)

  • Job
    Full-time
    Junior Level
  • Data
    IT & Cybersecurity
  • Pune

AI-generated summary

  • You need 2+ years in NLP, Python, SQL, and Solr. Experience in ML, Linux, and RCA is a must. Strong CS fundamentals, communication skills, and passion for innovation are essential.
  • You will maintain and support AI models, train or re-train models, gather feedback for enhancements, develop configurations for client onboarding, and collaborate with teams to deliver insights.

Requirements

  • Good knowledge of the Python programming language and of the data engineering and data science ecosystem in Python.
  • Hands-on experience in developing and supporting Machine Learning solutions.
  • Hands-on experience in developing Natural Language Processing (NLP) solutions, Generative AI, LLMs, and advanced RAG techniques.
  • Hands-on experience with the SQL ecosystem.
  • Hands-on experience with the Solr framework with Python.
  • Experience working on Linux and with shell scripting.
  • Basic knowledge of statistical modelling.
  • Strong Computer Science fundamentals are a must (data structures, algorithms, OS, databases).
  • Good communication and organizational skills with significant attention to detail
  • Experience partnering with cross-functional teams of domain experts, engineers, data scientists, and production support teams.
  • Strong attention to detail with excellent problem-solving skills.
  • Desire and ability to thrive in a distributed technical environment while working on multiple projects simultaneously.
  • Passionate about the latest technology trends, with a strong desire for innovation.
  • Self-motivated, self-managing and able to work independently.
  • Good knowledge of developing and productionizing scalable solutions.
  • Efficient at root cause analysis (RCA) for production issues.
  • Good knowledge of deployment pipelines, GitHub, Jenkins, APIs, etc.
  • Bachelor's degree (B.E./B.Tech. in computer science, engineering, statistics/mathematics, or a related field) from a four-year college or university, or equivalent; a master's is a plus.
  • At least 2 years in total of experience working on enterprise AI products.
  • Experience in supporting software products or applications in the AI/ML domain.

Responsibilities

  • Work on configuration, maintenance, and support of a portfolio of AI models and related products.
  • Work on AI model releases.
  • Train new models or maintain existing models through iterative re-training.
  • Take feedback from analysts, end users, and domain experts to perform model calibration, bug fixes, and enhancements.
  • Work on new client onboarding by developing configurations for model pipelines.
  • Deliver models to the production deployment team and coordinate model production deployments.
  • Effectively collaborate with cross-functional teams.
  • Dive deep into data, analyzing it to discover patterns and root causes.
  • Generate insights that drive the product.

FAQs

What kind of experience is required for this Data Scientist position?

Candidates must have at least 2 years of experience working with Natural Language Processing (NLP) and enterprise AI products.

What programming language is primarily used for this role?

Python is the primary programming language used in this role; experience with SQL and the Solr framework is also required.

Is prior experience with machine learning necessary?

Yes, hands-on experience in developing and supporting Machine Learning solutions is essential for this role.

What educational background is preferred for this position?

A Bachelor's degree in Computer Science, Engineering, Statistics/Mathematics, or a related field is required; a Master's degree is a plus.

Are candidates expected to work independently?

Yes, candidates should be self-motivated, self-managing, and able to work independently while managing multiple projects simultaneously.

Is experience in cross-functional team collaboration important for this job?

Yes, the ability to collaborate effectively with cross-functional teams of domain experts, engineers, data scientists, and production support teams is important.

What type of models will the Data Scientist be working with?

The Data Scientist will work with AI models, including developing, training, maintaining, and supporting Natural Language Processing (NLP) solutions, Generative AI, and advanced Retrieval-Augmented Generation (RAG) techniques.

Is there a specific framework that candidates should be familiar with?

Yes, candidates should have hands-on experience with the Solr framework and must also be comfortable working in a Linux environment with shell scripting.

Will the Data Scientist be responsible for model deployment?

Yes, the role includes coordinating model production deployments and collaborating with deployment teams.

What skills are necessary for handling production issues?

Candidates should be efficient at performing root cause analysis (RCA) for production issues and have knowledge of deployment pipelines, GitHub, Jenkins, and APIs.

When you have to be right

Industry: Technology
Employees: 10,001+

Mission & Purpose

Wolters Kluwer (EURONEXT: WKL) is a global leader in professional information, software solutions, and services for the healthcare, tax and accounting, financial and corporate compliance, legal and regulatory, and corporate performance and ESG sectors. We help our customers make critical decisions every day by providing expert solutions that combine deep domain knowledge with specialized technology and services. Wolters Kluwer reported 2022 annual revenues of €5.5 billion. The group serves customers in over 180 countries, maintains operations in over 40 countries, and employs approximately 20,000 people worldwide. The company is headquartered in Alphen aan den Rijn, the Netherlands.