Project Summary
Framework Metadata Extraction Framework (FAMEF) extends the existing SOMEF framework and extracts scientific software metadata, including a description of the software's functionality, from its documentation. This streamlines the work of researchers who want to reuse scientific software. We have trained supervised classifiers that detect different categories of metadata. We will evaluate the classifiers on unseen repositories that were not used in training.
The FAMEF project is carried out by Pratheek Athreya, Ling Li, Sharad Sharma, Yi Xie, and Yidan Zhang.
Wastes
1. Transport: Because only the project manager was required to attend client meetings, the client had to explain requirements and concerns to the project manager, who then relayed them to the team member responsible. This relay added significant overhead. To reduce this waste, our project manager encouraged everyone to attend the client meetings whenever possible.
2. Over-Production: In the beginning, we forked SOMEF and uploaded our code and CSV files to the fork. Later, we decided to create a new repository named “FAMEF”, so we had to upload our existing code and CSV files a second time.
3. Over-Processing: Because we worked remotely, communication was difficult, and we sometimes had to answer the same questions multiple times.
Deliverables
- Create a Google Slides presentation covering the project scope
- Create 5 CSV files for the extended corpus after adding new repositories
- Improve the performance of the 4 existing classifiers
- Create a functionality classifier and add its extraction result to the output JSON file
- Create a Google Slides presentation including project summary, motivation, objectives, results and resources, and conclusions for final presentation
- Create a demo video for the final presentation
- Create result visualizations to show the performance improvements for presentation and report
- Publish a project report
Milestones
- Scoping: A clear project scope helps the team understand the project and enables the project manager to allocate the right people to each task. To scope the project properly, the project manager should organize the team, learn about the background of each team member, discuss the project, and communicate with the client in a timely manner. This milestone focused on addressing these concerns and completing the related documents, such as slides and reports. The project manager also checked with the client, held discussions within the team, and prepared for the development phase.
- Four Existing Classifiers Improvement: This milestone focused on one of the main objectives of FAMEF. The team worked on improving the performance of the four existing classifiers (description, installation, invocation, and citation) by exploring different classification models.
- Functionality Classifier Creation: This milestone focused on another main objective of FAMEF. The team would create a new classifier for software functionality and find the model with the best performance. Additionally, the developers should add the extraction result for functionality to the output JSON file.
- Presentation (Demo) and Report: This milestone focuses on preparing for the presentation, presenting the results to the client, and submitting the project report. To prepare for the presentation, the team would create result visualizations, record a demo video, publish the slides, and attend a dress rehearsal.
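The Functionality Classifier Creation milestone mentions adding the extraction result to the output JSON file. The following is only a minimal sketch of what that step might look like. The function name add_functionality, the "excerpt" and "confidence" keys, and the 0.8 threshold are all hypothetical choices made for illustration; they are not taken from the FAMEF or SOMEF code.

```python
import json


def add_functionality(output, predictions, threshold=0.8):
    """Return a copy of `output` with a "functionality" entry added.

    `predictions` is a list of (excerpt, score) pairs, as a classifier
    might produce. The key names and the threshold are assumptions.
    """
    kept = [
        {"excerpt": text, "confidence": round(score, 2)}
        for text, score in predictions
        if score >= threshold
    ]
    result = dict(output)  # shallow copy; the input dict is left unchanged
    if kept:
        result["functionality"] = kept
    return result


# Hypothetical existing output with one already-extracted category.
existing = {
    "description": [
        {"excerpt": "A tool for metadata extraction.", "confidence": 0.91}
    ]
}

# Hypothetical classifier scores for two README excerpts.
predictions = [
    ("Extracts metadata from README files.", 0.93),
    ("See the LICENSE file.", 0.12),
]

updated = add_functionality(existing, predictions)
print(json.dumps(updated, indent=2))
```

Only excerpts whose score reaches the threshold are written, so low-confidence text does not appear in the output file.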
Gantt Diagram
