Project A: Diagnosing Disease
In this project, you will create a neural network that diagnoses patients for cirrhosis (severe scarring of the liver) based on numerical and categorical factors. You will use the dataset cirrhosis.csv.
- The dataset has some missing data (“NA”). Remove all rows with missing data.
- The input data X should consist of all the feature columns except “ID,” “N_Days,” “Status,” and “Drug.” Drop those columns.
- Convert categorical data into numerical data using one-hot encoding (see Other Machine Learning Techniques). You may find the Python function get_dummies() rather useful.
- The response variable (output) should be set to Status. Status is either “D” for death of the patient or “C” or “CL” for no recorded death of the patient during the trial. The following code should be used to transform the string data into numerical data:
y = df['Status'].apply(lambda x: 0 if x=='D' else 1)
- You will also need to convert string data to numerical in the input dataframe, X. Use the following code to do this:
X = X.replace('F',1)
X = X.replace('M',0)
X = X.replace('Y',1)
X = X.replace('N',0)
- Now you’re ready to do the train/test split and build your model! Using TensorFlow, set the number of units in the hidden layers to 16 and 16 (two hidden layers), and set the number of epochs to 10. This model converges very quickly on the data, so there is no need to run many epochs.
- Record the final loss and accuracy of the training step as well as the accuracy score of the model on the testing set. How did your model do? Based on the accuracy score, how confident would you be in using this model to diagnose cirrhosis?
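The steps in Project A can be sketched as two small helper functions. This is only one possible arrangement, not a required solution. The function names prepare_data and build_model are hypothetical, and several details are assumptions that the project description does not specify: that “Edema” is the only feature column with more than two categories (so it is the one passed to get_dummies()), and that the network uses ReLU hidden activations, a sigmoid output, the Adam optimizer, and binary cross-entropy loss.

```python
import pandas as pd
import tensorflow as tf


def prepare_data(df):
    """Return (X, y) following the preprocessing steps in the project list.

    Hypothetical helper: the name and the exact ordering are illustrative.
    """
    df = df.dropna()  # remove every row that has missing data

    # Response: 0 if the patient died ("D"), 1 otherwise ("C" or "CL")
    y = df["Status"].apply(lambda x: 0 if x == "D" else 1)

    # Inputs: every column except the four named in the project description
    X = df.drop(columns=["ID", "N_Days", "Status", "Drug"])

    # Assumption: "Edema" is the only multi-category column, so it is the
    # only one that is one-hot encoded here.
    X = pd.get_dummies(X, columns=["Edema"], dtype=float)

    # Two-valued string columns become 0/1, as in the project's code
    X = X.replace("F", 1).replace("M", 0).replace("Y", 1).replace("N", 0)

    # This cast raises an error if any string values are still present
    return X.astype("float32"), y


def build_model(n_features):
    """Two hidden layers of 16 units each, one sigmoid output unit."""
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(n_features,)),
        tf.keras.layers.Dense(16, activation="relu"),    # assumed activation
        tf.keras.layers.Dense(16, activation="relu"),    # assumed activation
        tf.keras.layers.Dense(1, activation="sigmoid"),  # binary output
    ])
    model.compile(
        optimizer="adam",            # assumed optimizer
        loss="binary_crossentropy",  # assumed loss for a 0/1 response
        metrics=["accuracy"],
    )
    return model
```

With the real file, one might call prepare_data(pd.read_csv("cirrhosis.csv")), since read_csv treats “NA” entries as missing by default. The result can then be split into training and testing sets (for example with scikit-learn's train_test_split), trained with model.fit(X_train, y_train, epochs=10), and scored with model.evaluate(X_test, y_test).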
Project B: Recognizing Digits
There are many real-world applications that rely on the ability of technology to recognize handwritten digits, including reading forms automatically, postal code recognition in mail sorting, automated grading, and medical record digitization.
- With the help of TensorFlow, build a convolutional neural network (CNN) model in Python and train it to recognize the handwritten digits in the MNIST dataset. This article on machine learning walks you through the process, including detailed discussions of variations and improvements on the model.
- Try out different learning rates, adding or removing layers and other variations, and report on their effectiveness.
- As a group, discuss the advantages and disadvantages of the CNN model as compared to other potential models for classification of digits.
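As a possible starting point for Project B, the following is a minimal sketch of a small CNN built with the Keras API in TensorFlow. It is not the architecture from the article mentioned above. The function names build_cnn and load_mnist are hypothetical, and the layer sizes, the Adam optimizer, and the default learning rate are all assumptions chosen only for illustration. The learning rate is exposed as a parameter so that different values can be compared.

```python
import tensorflow as tf


def build_cnn(learning_rate=1e-3):
    """Small CNN for 28x28 grayscale digit images (illustrative sizes)."""
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(28, 28, 1)),
        tf.keras.layers.Conv2D(32, 3, activation="relu"),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Conv2D(64, 3, activation="relu"),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(64, activation="relu"),
        tf.keras.layers.Dense(10, activation="softmax"),  # one unit per digit
    ])
    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=learning_rate),
        loss="sparse_categorical_crossentropy",  # labels are integers 0-9
        metrics=["accuracy"],
    )
    return model


def load_mnist():
    """Download MNIST and scale pixel values to the range [0, 1]."""
    (x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
    # Add a channel axis so each image has shape (28, 28, 1)
    x_train = x_train[..., None].astype("float32") / 255.0
    x_test = x_test[..., None].astype("float32") / 255.0
    return (x_train, y_train), (x_test, y_test)
```

A group could train one model per candidate learning rate, for example build_cnn(0.01) and build_cnn(0.0001), call fit on the training images for a few epochs, and compare the accuracy that evaluate reports on the test images. Adding or removing Conv2D and Dense layers in build_cnn is one way to explore the other variations the project asks about.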
Project C: Artistic License
In this project, you will use generative AI tools, such as the LLM-based chatbot ChatGPT for text and OpenArt for images, to create an “original” story with illustrations.
- As a group, brainstorm themes and characters for a story. Then use appropriate prompts to guide the LLM in creating a coherent narrative, with a clear plot and realistic characters. Explore different genres and themes, such as adventure, fantasy, or educational content. The team should edit and refine the generated text to ensure coherence and appropriateness for their intended target audience.
- Use AI-based art generation tools to create illustrations for the story. The team will create prompts based on the generated story to guide the art generation process, ensuring that the illustrations align with the narrative and experimenting with different artistic styles and techniques to create a visually appealing storybook. The group should refine and select the best illustrations for inclusion in the final product.
- For added interest, use AI-based music composition tools to create background music for the storybook. The group will select appropriate music styles based on the story's mood and setting. The music should enhance the overall storytelling experience.