Software Tools

For the hackathon, the minimum goal is to run a simple Convolutional Neural Network (CNN) end to end, even if its results are not good enough to be useful, just to go through all the motions. This will demonstrate the "Findable" and "Accessible" parts of the FAIR principles. To do this, we will download a dataset (under 1 GB, consisting of DICOM images with two labels), upload it to Google Drive, and build and run the CNN on Google Colab with a free TPU. Alternatively, the same code can be run on other dedicated resources that participants may have access to.
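As a first step toward that workflow, the sketch below indexes a two-label image dataset and splits it into training and validation sets before any CNN is built. It is only an illustration under stated assumptions: the folder layout (one subfolder per label under a dataset root), the `.dcm` file extension, and the function names `index_dataset` and `stratified_split` are all hypothetical and not part of the hackathon material.

```python
import random
from pathlib import Path


def index_dataset(root, extensions=(".dcm",)):
    """List image files under root, assuming one subfolder per label.

    Returns (labels, samples): labels is the sorted list of subfolder
    names, samples is a list of (path, label_index) pairs.
    """
    root = Path(root)
    labels = sorted(p.name for p in root.iterdir() if p.is_dir())
    samples = []
    for label_index, name in enumerate(labels):
        for path in sorted((root / name).rglob("*")):
            # Keep only files with an expected image extension.
            if path.is_file() and path.suffix.lower() in extensions:
                samples.append((path, label_index))
    return labels, samples


def stratified_split(samples, val_fraction=0.2, seed=0):
    """Split samples per label so both labels appear in each subset."""
    rng = random.Random(seed)
    by_label = {}
    for path, label in samples:
        by_label.setdefault(label, []).append(path)

    train, val = [], []
    for label, paths in sorted(by_label.items()):
        paths = list(paths)
        rng.shuffle(paths)
        # Hold out at least one file per label when there is more than one.
        n_val = max(1, round(len(paths) * val_fraction)) if len(paths) > 1 else 0
        val += [(p, label) for p in paths[:n_val]]
        train += [(p, label) for p in paths[n_val:]]
    return train, val
```

On Colab, `root` would typically point at a folder inside the mounted Google Drive; the exact path depends on where the dataset was uploaded. The resulting lists of paths and labels can then be passed to whichever image loader and CNN framework the participants choose.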

The more exciting part is to come up with problems to solve using the (much larger) datasets provided, which include DICOM images, metadata, and even some datacubes. Given the short duration of the hackathon, it is neither expected nor likely that any such problems will be solved during the event. The hope is that this will be the start of brainstorming: defining the problems, associating specific datasets with them, and outlining a path forward. Handling these workflows will also require larger computing resources than are available at the hackathon.

Envisioned problems:

To enable such possibilities, the three datasets on offer all relate to one organ (breast) and have associated control samples. The hope is to make similarly well-curated and documented datasets available for other organs as well.

The following tools will be useful:

Additional/alternate tools: