Cacti Global
Job Details
Job Title: Sr. Data Scientist
Experience: 6+ Years
Location: Mumbai/Pune (Hybrid)
Company Overview:
Cacti is a "Talent-as-a-Service (TaaS)" marketplace platform that helps bring together global talent seekers, top talent, and skill enhancers under one brand. We also provide Managed Services in the Legal, LegalTech, and Tech domains. We democratize access to global opportunities for talented professionals and empower organizations with transparent, outcome-based solutions. We have worked with clients across diverse sectors in the areas of legal and business solutions, contracts management, finance, technology, and analytics.
Roles and Responsibilities:
As a core member of the NLP team, you will research, prototype, develop, deploy, and scale innovative ML/DL solutions in collaboration with legal experts and Product Management teams.
You will develop predictive models on large-scale datasets to address various business problems leveraging advanced statistical modelling, machine learning, or data mining techniques.
Design and implement infrastructure for orchestrating end-to-end machine learning lifecycles.
Set up processes to monitor and continually improve the efficiency and performance of models.
Carry out software development, including algorithm implementation, optimization, performance profiling, integration with production systems, testing, and documentation.
Write high-quality production code as you build and maintain robust, scalable machine learning systems.
Program primarily in Python, using efficient algorithms and software design patterns.
Scale and improve the performance of natural language systems in production.
Requirements
Python Programming: Strong proficiency in Python, as it is the primary language for our application.
API Integration: Skills in integrating external APIs, specifically the OpenAI API, for making queries and retrieving responses.
LangChain Proficiency: Experience with LangChain libraries and the ability to use them for document chunking and invoking LLMs. Familiarity with the LangChain framework and its components.
Document Processing: Expertise in processing and handling documents, especially in breaking them into chunks or segments based on the requirements of the application.
Embedding Techniques: Knowledge of embedding techniques such as word embeddings, sentence embeddings, or document embeddings for representing chunks of documents as vectors.
Vector Database Management: Experience in working with vector databases (Qdrant/Elasticsearch/FAISS/ChromaDB/Pinecone) for storage and retrieval of document embeddings.
Strong problem-solving skills with an emphasis on product development.
Experience creating and using advanced machine learning algorithms and statistics: regression, simulation, scenario analysis, modelling, clustering, decision trees, neural networks, etc.
Knowledge of a variety of machine learning techniques (clustering, decision tree learning, artificial neural networks, etc.) and their real-world advantages/drawbacks.
Knowledge of advanced statistical techniques and concepts (regression, properties of distributions, statistical tests and proper usage, etc.) and experience with applications.
Experience with text extraction from Images/Documents/PDFs.
Experience with deep learning architectures such as LSTMs and Transformers.
Experience with cutting-edge deep learning-based NLP models such as BERT.
Experience with deep learning NLP toolkits such as Hugging Face, spaCy, etc.
Experience with deep learning frameworks like TensorFlow and PyTorch.
Experience with Agile and Scrum.
Experience with proprietary and open source LLM training (OpenAI, LLaMA, Falcon, Google), creating datasets, and working with embeddings and PEFT/LoRA models.
Experience with Reinforcement Learning (RLHF/RLAIF model training) and knowledge of RL algorithms (PPO) will be a big plus.
Benefits
Flexible working hours and Remote working options
Competitive salary and Bonus incentives
Health Insurance, Medical Incentives, and Travel Incentives
Professional development and mentorship programs
Opportunity to work with global clients
Other Details
- Industry: IT Services & Consulting
- Recruiter Details: Cacti Global
- Job Tags: machine learning, python, deep learning, nlp
- Job Type: Full time
Key Skills
- Document Processing
- Statistics
- Agile
- Scrum
- Reinforcement Learning
- Python Programming
- API Integration
- LangChain Proficiency
- Embedding Techniques
- Vector Database Management
- Machine Learning Algorithms
- Text Extraction
- Deep Learning Architectures
- Deep Learning NLP Models
- Deep Learning Toolkits
- Deep Learning Frameworks
- LLM Training
- Datasets
- Embeddings
- PEFT
- LoRA Models
- RL Algorithms
Recruiter Details
- Cacti Global
- Other Maharashtra