Delete search term

Header

Main navigation

Students at the ZHAW productize ChatGPT, Midjourney and Co.

Computer science students at ZHAW worked in groups to build new tech demos, using the latest generation of generative AI as part of the subject "Artificial Intelligence 2". The goal was to gain practical experience with the available tools within a few working days and to get to know the associated open-source landscape. In the process, it became clear that even with little effort, exciting and extremely creative applications can be found and created with these technologies.

The following demonstrators stood out in particular: 

fAIritale (see Figure 1) - creates short bedtime stories based on anonymized real children's drawings. For each story, a drawing is annotated with a few keywords. Then, using this text and the image-to-image model Stable Diffusion, a similar-looking but anonymized illustration is created. The matching story is generated by GPT-3.5 Turbo using the OpenAI interface. Both model queries include prompt engineering to generate results that match the story theme.

 

Improving object detection with LLMs (see Figure 2) - uses GPT-3 to increase the robustness of object detectors. For this purpose, a standard object detector (YOLOv5) is fed with images that are prepared to provoke false detections (a so-called "PGD Adversarial Attack"). The list of detected objects is then sent to GPT-3 with the question whether all detections fit into the same scene based on the context. Thus, inappropriate detections can be identified and subsequently removed.

Image Freedom: A Machine Learning Approach to generate royalty free Images see Figure 3) - generates royalty free images that have similar content to another potentially non-royalty free image.  To do this, an image-to-text model is first used to create a detailed description of the image; the resulting text is then fed into a text-to-image model. This creates a new image that is visually similar, but also represents the original content. This application could be conveniently accessed via cell phone using a Telegram bot.