Associate AI (NLP Engineer)
Resume parser or AI Assistant
• Tech stack used: FastAPI, fine-tuning BERT NER models, fine-tuning Detectron2 (Faster R-CNN), AWS S3, PostgreSQL, OpenAI (fine-tuning and prompt engineering), FastText model training, ReactJS, Azure deployment.
• As an AI developer, I single-handedly developed the resume parser project from scratch and successfully released it to production within a year.
• Led a team of 10 data preprocessing analysts to support NER and section layout model training based on data annotated using Label Studio.
• Completed the first phase using the Python Flask framework, then migrated it to FastAPI to enable asynchronous processing and improve performance.
• Completed numerous POCs as part of this research project, including Detectron2 layout detection trained on 1,000 samples, NER training and fine-tuning with different BERT NER models, and hyperparameter tuning with Optuna to reduce overfitting.
• Created a resume viewer with a ReactJS frontend and a Python FastAPI backend, giving interns access to the data without handing them the underlying data files, to protect data privacy.
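The benefit of the move to asynchronous processing mentioned above can be sketched with the standard library alone. This is an illustration under stated assumptions, not the project's code: `parse_resume` and `parse_batch` are hypothetical names, and `asyncio.sleep` stands in for I/O-bound work such as a file download or a model call.

```python
import asyncio


async def parse_resume(resume_id: str) -> dict:
    # Hypothetical stand-in for I/O-bound work (storage read, model request).
    await asyncio.sleep(0.01)
    return {"resume_id": resume_id, "status": "parsed"}


async def parse_batch(resume_ids: list[str]) -> list[dict]:
    # gather() runs the coroutines concurrently and keeps the input order.
    return await asyncio.gather(*(parse_resume(r) for r in resume_ids))


results = asyncio.run(parse_batch(["r1", "r2", "r3"]))
```

While one resume waits on I/O, the event loop can work on the others, which is the behavior an async web framework relies on for its request handlers.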
Smart candidate search
• Tech stack used: Elasticsearch, BERT model (paraphrase-MiniLM-L6-v2) for vector embeddings, Django.
• Developed a smart candidate search that finds candidates by skills, job title, years of experience, and location, using vector similarity between the query and a sentence built from each candidate’s data that prioritizes these four parameters.
• Created a candidate index in Elasticsearch and added a new key, “vector”, holding the embedding generated by the BERT model. The stored vectors are used with Elasticsearch’s vector similarity search: comparing the query vector with each candidate’s sentence vector ranks the most similar candidates at the top.
• In phase 2 of the project, used OpenAI’s speech recognition model (Whisper) to convert voice to text and integrated it into the smart search, so that recruiters can provide either text or voice as input to the search engine.
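The ranking idea behind the search can be sketched as follows. This is a minimal illustration, not the production implementation: the field names, the `name` key, and the sample candidates are hypothetical, and `toy_embed` is a bag-of-words stand-in for the real sentence-embedding model so the example runs without external libraries. In the project described above, the vectors come from paraphrase-MiniLM-L6-v2 and the similarity search runs inside Elasticsearch.

```python
import math
from collections import Counter


def candidate_sentence(candidate: dict) -> str:
    """Join the four prioritized fields into one sentence to embed."""
    skills = ", ".join(candidate["skills"])
    return (
        f"{candidate['job_title']} with {candidate['years_experience']} years "
        f"of experience in {skills}, based in {candidate['location']}"
    )


def toy_embed(text: str) -> Counter:
    # Assumption: word counts replace the real embedding model here.
    return Counter(text.lower().replace(",", " ").split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank(query: str, candidates: list[dict]) -> list[tuple[float, str]]:
    """Return (score, name) pairs, most similar candidate first."""
    q = toy_embed(query)
    scored = [
        (cosine(q, toy_embed(candidate_sentence(c))), c["name"])
        for c in candidates
    ]
    return sorted(scored, reverse=True)


CANDIDATES = [
    {"name": "A", "job_title": "Data Engineer", "years_experience": 5,
     "skills": ["python", "spark"], "location": "Pune"},
    {"name": "B", "job_title": "Frontend Developer", "years_experience": 3,
     "skills": ["react", "typescript"], "location": "Berlin"},
]

ranking = rank("python data engineer in pune", CANDIDATES)
```

Here candidate A shares more terms with the query than candidate B, so it appears first in `ranking`. With real embeddings the same cosine comparison also rewards paraphrases, not just exact word overlap.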