
FAMEF

Project Summary

Framework Metadata Extraction Framework (FAMEF) extends the existing SOMEF framework and extracts scientific software metadata, including the software's functionality, from its documentation. This streamlines the work of researchers who want to reuse scientific software. We have trained supervised classifiers that detect different categories of metadata. We will evaluate the classifiers on unseen repositories, different from the ones used in training.
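Evaluating on unseen repositories means that no repository contributes excerpts to both the training and the test set. The following is a minimal sketch of such a split together with a precision/recall check, not FAMEF's actual evaluation code. The function names, the `(repository, text, label)` tuple layout, and the example labels are assumptions made for illustration.

```python
import random
from collections import defaultdict


def split_by_repository(samples, test_fraction=0.2, seed=0):
    """Split (repository, text, label) tuples so that test repositories
    never appear in the training set. Hypothetical helper, not part of FAMEF."""
    by_repo = defaultdict(list)
    for sample in samples:
        by_repo[sample[0]].append(sample)

    # Shuffle repository names (not individual excerpts) reproducibly.
    repos = sorted(by_repo)
    random.Random(seed).shuffle(repos)

    # Hold out at least one whole repository for testing.
    n_test = max(1, round(len(repos) * test_fraction))
    test_repos = set(repos[:n_test])

    train = [s for r in repos if r not in test_repos for s in by_repo[r]]
    test = [s for r in repos if r in test_repos for s in by_repo[r]]
    return train, test


def precision_recall(y_true, y_pred, positive):
    """Precision and recall for one metadata category (the positive label)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Illustrative data only: repository names and labels are made up.
samples = [
    ("repo_a", "pip install example", "installation"),
    ("repo_a", "This tool aligns sequences.", "description"),
    ("repo_b", "Run make to build.", "installation"),
    ("repo_c", "A library for graph analysis.", "description"),
    ("repo_d", "conda install example", "installation"),
    ("repo_e", "Computes image features.", "description"),
]
train, test = split_by_repository(samples, test_fraction=0.2, seed=0)
```

Splitting at the repository level avoids inflated scores caused by near-duplicate excerpts from the same README landing on both sides of the split. For a scikit-learn workflow, `GroupShuffleSplit` provides a comparable group-aware split.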

The FAMEF project is carried out by Pratheek Athreya, Ling Li, Sharad Sharma, Yi Xie, and Yidan Zhang.

Wastes

1. Transport: Since only the project manager was required to attend client meetings, the client discussed requirements and concerns with the project manager, who then relayed them to the team member in charge. This relay added significant overhead. To avoid this waste, our project manager encouraged everyone to attend as many client meetings as possible.

2. Over-Production: At the beginning, we forked SOMEF and uploaded our code and CSV files to the fork. Later, we decided to create a new repository named “FAMEF”, so we had to upload our existing code and CSV files again to the new repository.

3. Over-Processing: Because we worked remotely, communication was difficult, and we sometimes had to answer the same questions multiple times.

Deliverables

  1. Create a Google Slides presentation covering the project scope
  2. Create 5 CSV files for the extended corpus after adding new repositories
  3. Improve the performance of the 4 existing classifiers
  4. Create a functionality classifier and add its extraction result to the output JSON file
  5. Create a Google Slides presentation for the final presentation, including project summary, motivation, objectives, results and resources, and conclusions
  6. Create a demo video for the final presentation
  7. Create result visualizations that show the performance improvements, for the presentation and report
  8. Publish a project report

Milestone

Gantt Diagram

[Image: Gantt diagram]