Thesis Project Overview

In my short animated film Remembrance, I attempt to capture the feeling of Alzheimer’s through the power of 3D animation and the assistance of emerging AI image generation. The piece wrestles with the confusion, grief, and loss of identity that come with the disease. At the same time, the film centers on the emotional journey of Hope as she finds herself thrown back into happy childhood memories of her father playing piano for her.

This thesis came from my personal experience with my Grandmother JoAnn, who had dementia/Alzheimer’s. I chose the medium of 3D animation, along with experimental machine-generated images, to tell the story, as these art forms provide the most flexibility to subjectively portray Hope’s experience. I wrote, animated, directed, and edited the three-minute film. To my knowledge, it is the first 3D animated honors thesis at Emory University. Additionally, Remembrance experiments aesthetically in the medium of 3D animation, using tactile textures to portray subjective reality. The film also marks one of the first narrative stories assisted by AI image generation.

In the following paper, I explore the process of developing the story and outline the creative and technical challenges of producing an animated film. Once I settled on my narrative, I researched psychological papers on Alzheimer’s and took inspiration from experimental methods of showing thoughts on film. Throughout production, the film grew to fit my technological abilities and the new industry of artificial intelligence. This thesis is a testament to emerging technological forms of storytelling and, foremost, to the memory of my Grandmother and all who have experienced family members with dementia/Alzheimer’s.

Part 1: Finding the Story

One of the most challenging aspects of my thesis was deciding on a story. My initial proposal from the Spring of 2022 centered on humanity’s relationship with social media and technology. I chose animation as the medium as it has a unique ability to portray digital and abstract spaces as a physical environment. This aesthetic desire cemented my core philosophy in how I wanted to approach an animated thesis: story and medium as one.

During the summer of 2022, I wrote a dramatic story of a teenager being pulled into the world of social media. I wanted to combine mediums to expressively animate that overwhelming feeling. With every draft, I realized the scope of the story continued to inflate. Combining 3D animation, stop motion, and 2D animation into one twisting narrative would be too much to pull together in nine months. So, in August I began to reconceptualize the entire story. I went through seven completely different scripts, storyboarded three full films, and wrote out themes and life experiences on countless notecards, trying to crystallize some emotional truth deep inside myself. Some days it was exhilarating, and other days it felt like I was a complete failure of an artist. If I do not know what story I want to tell, should I even be a film director?

At the beginning of the Fall 2022 semester, I decided on a simple story about a window washing robot trying to cheer up a sad office worker through the power of dance. It was sweet and two pages long, but I was still hesitant to begin production. I had years of experience in 3D animation simulating water, animating spaceships in a dogfight, and designing intricate procedural textures, but little experience modeling, rigging, and animating a living creature with realistic emotion. After consulting many 3D animators on the time required to animate such complex dancing movements–approximately 1,000 hours or more–I decided that the project’s scope was too large.

Early on I was already facing the prime intellectual challenge of my thesis: balancing the technical and conceptual artists inside myself. I wanted to push myself and learn something new with this project, but how could I know when my technical abilities would clash with the concept of a story? I did not want to start skiing down a hill only to realize halfway down that it was Mount Everest–but I also could not survey the slope options forever.

In September, I pivoted once again and rediscovered an old story from my notes about a child watching raindrops on the window of a car and imagining them as race cars. I began workshopping that idea. After two weeks of storyboarding, scripting, and technical research, I remained uneasy about the concept. The project was feasible, but the story felt emotionally shallow.

Then, in October, my Grandmother JoAnn passed away. Amid my grief and shifting family dynamics, it felt even more absurd to make a fun film about a child watching race car water droplets. All I had left of my Grandma was in my memories, and through many days of journaling, writing on bits of notecards, and storyboarding, I arrived at the story that became my thesis: a story about an old woman with dementia/Alzheimer’s.

Part 2: Development of the Story

(i) Emotional Inspiration

It was important to me not to tell a purely sad story. My Grandmother, for all the horrible days she did have towards the end of her life, also had some genuine moments of joy. I remember one family dinner we had a few years ago–when she could not remember any of her family members–and I spent the whole dinner asking her questions about her childhood. She had told me the stories many times before: of growing up in a Croatian neighborhood in Dearborn, Michigan, her adventures with her Italian boyfriend Jimmy Bruno, and how she would walk every day to school with her best friend, speaking her “home tongue” before they had to switch to English at school. I would ask her various questions until the right one sparked some memory inside of her and she got excited and started retelling the whole story. I repeated this process over many dinners and phone calls in the last couple of years of her life. At the end of this particular dinner, she turned to all of us, smiling, and said: “I do not know who any of you all are, but this meal has made me so happy. Thank you all.”

This moment was the feeling I wanted to capture. It was certainly sad to see her confused, but my Grandma somehow still maintained her wit, compassion, and love that were core to her character. My Grandmother still existed under the fog… we just had to lead her out.

(ii) Research

It was important to me to portray Alzheimer’s and dementia accurately. Through research in the psychological literature, I solidified my understanding of the different ways Alzheimer’s/dementia affects the brain. It can impact the ability to recall recent memories, recognize family members, and recognize or name objects. I also discovered the concept of music therapy, in which patients with dementia listen to music from their past, which can increase short-term recall and self-consciousness (Raglio 130). These studies crystallized the story of an old woman who, triggered by music from her past, enters a moment of lucidity. From my personal experience attempting to elicit such a response from my Grandmother through questions, it felt natural for the music to arrive as part of a regular visit from Hope’s son, Sam.

Before going further, I would like to briefly distinguish dementia from Alzheimer’s, as they are not the same. Dementia is a general decline of mental cognition and memory severe enough to affect day-to-day life. Alzheimer’s disease is the most common form of dementia.

With the story dynamic solidifying, I began to research films on Alzheimer’s–including Still Alice (Richard Glatzer and Wash Westmoreland, 2014), along with the animated short films Undone (Sara Laubscher et al., 2019) and Memorable (Bruno Collet, 2020). Each film captured different aspects of dementia/Alzheimer’s, and while Memorable captured the conflict Alzheimer’s can cause in a relationship, I was disappointed by the lack of depiction of what it truly feels like to be in that state of confusion. In Still Alice, for instance, the audience remains largely objective, witnessing Alice’s deterioration from the perspective of a family member.

I saw a unique opportunity to explore the feeling of memory and how it links with self-consciousness from the perspective of someone with dementia/Alzheimer’s. I wanted to ask: what does it feel like to exist in a world so unfamiliar? Furthermore, what does it feel like to suddenly connect with a flurry of childhood memories?

(iii) Development of the Memory Sequence

To portray memory in film I took strong conceptual inspiration from Mind Game (Masaaki Yuasa, 2004). While the film’s content has no relation to my thesis, it achieves a moment of pure thought better than any other film I have seen. The movie opens with a quick-paced montage of shots with seemingly no relation to each other. These shots are set in different locations, show unfamiliar characters, and make no sense on first viewing. The rest of the film unfolds linearly and slowly reveals the characters’ backstories. The same montage is shown at the end, but now each shot is contextualized in the story of the film, and suddenly the entire sequence takes on a euphoric, connective quality. In just a few frames the viewer links each new image to an emotional scene from earlier in the film. Through expert editing, Yuasa connects shot after shot to form an array of concepts that feels like the closest thing to pure thought on screen.

The thought-like effect of this montage is very close to my approach to memory. In my subjective experience, memory is not linear and our experience of it can shift rapidly as our thoughts change. It feels like one long sentence that does not know how to stop. It was clear to me that I did not want to tell a traditional linear sequence, but that the memories must feel continuous and connected through the natural chaos of one’s thought process.

To accurately capture the memories of an old woman, I began interviewing my older family members about their memories with Grandma JoAnn in childhood. These conversations built out concepts of family road trips in an old Ford, barbecuing lamb, a factory worker father, and the importance of ethnic Yugoslavian food. These details later filled out the world of her childhood.

The sequence of memories slowly became the centerpiece of the film, and I wanted them to feel as natural as thought. First it would show the core memory of Hope’s father playing piano, and then that would spark Hope into a rush of thoughts about other aspects of her childhood. Deciding on the content and order of the images was incredibly difficult, and not something I completely figured out until months into the process. From research I knew that smell plays a large part in memory, which eventually drove me to emphasize the smoking of the father, the sweet grass of the outdoors, and other tactile images.

Early on, I knew two aspects I wanted to maintain for this sequence: (a) the linkage between frames should feel like an expanding thought process, and (b) the memories should come from a tactile perception of smell, audio, and textures. With these conceptions in mind I moved into more specific planning for the rest of the film.

(iv) Animatic / Script Writing

My first real creative step was to draw out all the storyboards on notecards. I found it easier to go straight into visualizing the piece rather than writing, as I would not get stifled by the syntax of the script or character names and could instead focus purely on the visual language of the film. By drawing the frames immediately, I discovered the perspective of the piece.

By nature, a short film only shows a few minutes of someone’s life, and I wanted to consciously use that limitation to convey the feeling of someone who does not know their past. Opening the film with a wide shot from behind Hope–without detailing her character–immediately shows the audience everything her world contains. She knows the window in front of her, the chair beneath her, and has a vague sense of a hospital setting. Beyond that, the world falls off into darkness. While the shot is a “traditional” wide shot, the darkness on the edges frames the world the viewer enters before they even know they are in Hope’s perspective. Cutting to the close-up of her as she sits, breathing, emphasizes the stillness she feels in her life and fully settles the audience into her perspective.

Next, footsteps approach, but Hope gives no acknowledgment and the shot stays on her. I intend this moment to build up a subtle anticipation in the viewer to want to see the source of the footsteps, and by consciously not immediately cutting to reveal the new character, I hope the shot implies the listlessness that Hope feels. The world moves around her, and she remains motionless.

Initial storyboards from October 2022

I incorporated rain for multiple reasons. The raindrops sliding down the window and disappearing are thematically analogous to Hope’s experience with memories. The temporality of the drops suggests the fragility that Hope herself feels. The rain also reflects Hope’s internal state. In the opening shot, there are only remnants of rain lying still on the glass–just as Hope feels motionless. When the recording starts, the rain picks up and begins to roll down the glass. The moving drops convey the energy shift Hope feels as she becomes more alive.

The final important decision I made early on was to show Sam’s character very sparingly. By focusing largely on Hope, I hope the viewer stays connected to her. At the end, despite the invigorating journey of memories, she still has Alzheimer’s. She has still lost her ability to recognize her family. When she thinks Sam is Papa, cutting to a morphed version of her son would immediately tell the audience that Hope does not remember him. Holding off on that visual and revealing that he is her son purely through his dialogue allows the viewer to slowly experience the confusion at the same time as Hope.

With these beats roughed out, I took the storyboards and digitized them in Blender to create a rough 2D animatic. I also wrote out a script and began workshopping it with Professor Barba and the other creative producer, Kheyal Roy-Mieghoo. Over several weeks, the story beats of the film began to take form. Dialogue was added. Beats were moved around. I found the right piano piece online. By the middle of November I had a more polished 2D animatic.

Selected frames from the animatic

At this point, I had a vague idea of creating a 2D animated memory sequence and feeding it into AI to add an abstract level of detail that would morph over time. AI imaging was just coming to fruition at this point, and I saw great potential in its ability to assist animation. The few AI videos that existed at the time showcased its ability to morph and evolve images, which immediately struck me as memory-like. With the concept more solidified, I began to truly research the technical side to ensure that the major points–in particular the memory sequence–were feasible.

A super rough animatic from early in the process

(v) Technical Research

The technical research included 3D modeling tutorials in Maya, liquid simulation tutorials in Blender and Houdini, and copious amounts of AI image generation. While the 3D tutorials were relatively straightforward as I was already familiar with the CGI process, AI image generation was completely new to me, and as such, required more time.

I started the research by learning how to prompt still images with an online AI interface. Prompting refers to the process of providing a text phrase to the AI model, which generates an image consistent with the prompt. Figuring out the exact phrasing to use can get complex; for example, using the same words but swapping the order of two of them can change the image dramatically. I learned over time that there is a certain syntax of prompting that produces more accurate results:

(type of artwork) of (subject) with (subject details), (styles)

For example:

“Pencil drawing of a little girl with a rose, highly detailed, cinematic lighting”

In my later research, I found this extremely helpful and detailed guide to prompting online: https://openart.ai/promptbook
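As a sketch, the template above can be expressed as a small Python helper. This is purely a hypothetical illustration of the syntax, not a tool from my actual workflow:

```python
def build_prompt(artwork_type, subject, details, styles):
    """Assemble a prompt following the template:
    (type of artwork) of (subject) with (subject details), (styles)"""
    return f"{artwork_type} of {subject} with {details}, {', '.join(styles)}"

prompt = build_prompt("Pencil drawing", "a little girl", "a rose",
                      ["highly detailed", "cinematic lighting"])
# -> "Pencil drawing of a little girl with a rose, highly detailed, cinematic lighting"
```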

My first ever AI images:

“3d animated window washing robot cleans skyscraper Pixar”
 “flower floating city sunrise ethereal fantasy hyperdetailed mist Thomas Kinkade oil painting by James Gurney”

These first images showcase how the quality and structure of a prompt can dramatically affect the quality of an image. Once I became comfortable with prompting, I began doing some video tests. At the time, there weren’t any online tools built for processing video through AI, so I had to manually export every frame, run it through the same set of AI prompts and settings, and then download and recompile the AI-processed frames into a video. I did this process for live action video, 3D animation, and finally 2D animation.

For 2D animation to AI, I learned one had to trade added detail for consistency between frames. The grid on the right shows four seed variations on the same input of a jumping animation. Notice how much variation there is between the grid images. This inconsistency leads to an incredibly warped animated video.

With this test I started to sense that my conception of the 2D animation to AI memory sequence would need to shift to meet the technology, but these initial tests gave me the confidence to go forward with the project. To concentrate my focus I split the production portion into two separate halves: 3D Animation and AI Animation.

Table – Treemap chart tracking hours worked on the project

For context, the above treemap chart shows the time I manually tracked in each stage of production–with square size proportional to hours spent. The entire project totals 300+ hours. The figure could have some inaccuracies where I forgot to record, but it roughly shows where my time went–not including the 80+ hours on early concepts or the 50+ hours spent in the honors class in the fall semester.

Part 3: 3D Animation

(i) Prepping

Starting production was scary. I constantly felt like I needed to learn more. I talked with everyone I knew in the 3D industry about how they learned animation, their advice for modeling and texturing, and how they would approach the project. I started modeling the main character, but achieving proper topology, good character design, and the right level of detail took a lot of time. A week into modeling, I realized that it would take another month to reach the quality I wanted. I paused and asked myself an important question: what is really important? Spending more time modeling would take away from the entire development of the memory sequence. With my priorities clear, I immediately decided to use premade online models for Hope and the environment. The first large creative process in 3D, then, was texturing the world and characters.

(ii) Texturing

Creatively, I felt the textures offered a real opportunity to emotionally bring the audience into the mind of an old person with dementia. The black and white color palette creates a natural emotional contrast with the colorful memories and reflects the blandness Hope feels. The hand-drawn wrinkles are intended to make her feel ancient and filled with worn-out emotion. The lack of real skin detail, in favor of large sketchy lines, indicates the level of vividness with which she experiences the world. A primary challenge of CGI is that the world can feel weightless. Contextualizing the entire world in a paper-like texture immediately imbues the animation with the fragile feeling of paper, familiar to the audience. The tactility also hints at a broader level of subjectivity in the space. We do not occupy our world anymore, but rather a world that is light and faded.

There are three main factors in the material of an object: diffuse (color), roughness (how matte or glossy the surface is), and bump (small-scale three-dimensional surface detail). For paper, the diffuse and roughness are largely uniform, so I could keep them constant–a near-white diffuse–while using a noise pattern as the bump to achieve a realistic paper effect.
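To illustrate the idea, here is a toy Python sketch of evaluating such a paper-like material at a surface point. The hash-based noise is only a stand-in for a procedural noise texture like Blender’s, and all the values are illustrative, not my actual shader settings:

```python
import math

def noise(x, y):
    """Cheap deterministic value noise in [0, 1), standing in for a
    procedural noise texture node."""
    n = math.sin(x * 12.9898 + y * 78.233) * 43758.5453
    return n - math.floor(n)  # keep only the fractional part

def paper_material(x, y):
    """Evaluate a paper-like material at a surface point (x, y):
    uniform diffuse and roughness, noise-driven bump for micro detail."""
    return {
        "diffuse": (0.95, 0.95, 0.95),      # near-white, constant
        "roughness": 0.8,                   # paper is matte, constant
        "bump": noise(x * 40.0, y * 40.0),  # high-frequency surface noise
    }
```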

For the face details, I spent a week hand-drawing the sketchy lines on my drawing tablet, then mapped the image onto the face mesh through a process called UV unwrapping. I fed this image into the diffuse texture socket, which completed the texture.

(iii) Animating

Heading into actual animation I collected as many notes as I could from animators and books online. Fortunately, I had just spent over a year learning 2D animation on a different short film, and all of the core principles–from squash and stretch to anticipation and secondary action–still apply in three dimensions. The tricky part of working on a computer was operating a rig and the technical minutiae of editing transformation graphs. Additionally, I faced the creative challenge of attempting a subtler performance than in my previous work. I worried that over-exaggerating emotions as I had in 2D animation would come off as childish and take away from a viewer’s ability to connect with the complex feelings of Hope.

To assist the animation I went through hours of video of people with Alzheimer’s to see how they acted, how fast they moved, how frequently they blinked, how heavily they breathed, and how they expressed emotion. I used these videos as reference throughout the process. To make the animation more manageable, I broke each beat down in an Excel sheet:

Week by week I worked on each moment. My rough steps for each beat were:

  1. Film reference footage of myself doing the action
  2. Rough block out of the key poses with constant keyframes (each emotion would snap to the next and there wouldn’t be any motion interpolation between)
  3. Change to bezier keyframes (motion interpolation)
  4. Receive notes from animator friends / producers on the project (these might be something like “blinking is 2 frames too slow here”, “have the head dip down to make it more realistic”, “what if she feels more excited than shocked?”, et cetera)
  5. Adjust based on the notes
  6. Move on to next beat

I ran into many difficulties throughout this animation process. I had to learn the nonlinear animation (NLA) editor in Blender, which required many tutorials. Early on I realized that the rig that came with the online model was not working properly, so I redid the rig from scratch over the course of a week. A character rig is the invisible skeleton that animators manipulate to pose characters. To make the correct part of the mesh move when a bone moves, one must paint a weight texture onto the vertices of the mesh. Weight painting was an evolving process, and one I returned to throughout animation whenever the rig malfunctioned. For example, one particularly difficult glitch was Sam’s elbow popping out of place when moving in a certain direction. It took me a week of talking with 3D animators online and watching YouTube videos to fix the issue, which turned out to be a simple misaligned pole axis for the forearm bone.

Once I had a rough animation of the beats, I started to render each shot to edit together more precisely.

(iv) Rendering

Rendering is the process of creating a final image from a 3D scene. The process calculates the light paths in the scene which results in colors, shadows, reflections, and other inputs that influence the final pixels in a CGI image. Rendering is usually completed along with or immediately after lighting as they can affect each other.

For most of the production I rendered at relatively low quality, which meant significant noise in the image, but a full shot took only 10-30 minutes to render. At final quality, each frame could take 1-2 minutes, which translates to 5-15 hours for an individual shot. These long render times meant I had to strategically render certain shots overnight throughout production.
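As a back-of-the-envelope check, the render-time arithmetic works out as follows (the 12.5-second shot length is an illustrative example, not a specific shot from the film):

```python
fps = 24                  # frames per second
shot_seconds = 12.5       # an illustrative shot length
minutes_per_frame = 2     # final-quality render cost per frame

frames = int(shot_seconds * fps)               # 300 frames
total_hours = frames * minutes_per_frame / 60  # 10.0 hours
print(f"{frames} frames -> {total_hours} hours")  # prints "300 frames -> 10.0 hours"
```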

Part 4: AI “Animation”

(i) Learning the Technical

My introduction to image generation began with the online interface NightCafe. This pay-to-use system let me easily play with different AI models with no technical knowledge, and I tried as many as I could get my hands on. I found Midjourney too simple and inflexible through its Discord interface. DALL-E was great, but I had to join a waitlist for full access. Stable Diffusion proved the most flexible. The increased manipulability came with some loss in out-of-the-box quality, but it gave me far more control over the image.

The easy online interface of NightCafe and the interface of AI installed directly on my computer

After two weeks of figuring out how to properly install Stable Diffusion and upgrading computer parts, I was finally ready to generate my own images at full computer power in late December. This step opened my eyes to all the factors of image generation, including:

  • The prompt itself
  • The negative prompt (which tells the model what to avoid)
  • Classifier-free guidance scale, or “CFG” (how closely the result matches the prompt)
  • Resolution (models are trained at a specific resolution, typically 512×512, and produce much better images at that resolution)
  • Steps (how many rounds of noise removal the AI performs on the initial noise, essentially increasing quality)
  • Seed (the base noise pattern)
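These factors can be summarized as a settings dictionary. The key names and values below are illustrative, loosely modeled on common Stable Diffusion interfaces rather than any exact API:

```python
# Illustrative generation settings for a single still image
# (names and values are examples, not an exact API).
generation_settings = {
    "prompt": "oil painting of a father playing piano, impressionistic",
    "negative_prompt": "distorted, text, cloned",  # what to avoid
    "cfg_scale": 7.5,   # how closely to match the prompt
    "width": 512,       # models trained at 512x512 do best there
    "height": 512,
    "steps": 30,        # rounds of noise removal
    "seed": 42,         # base noise pattern; same seed gives same image
}
```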

Before continuing, I want to provide a quick note on my understanding of how AI image generators are trained and work. AI models are trained on a massive set of images and their associated words, scraped from internet metadata. The machine takes in the word and image pairs and generates descriptive dimensions along which to compare various images and their words. These dimensions form what is called “latent space.” The machine then trains itself to replicate an image by taking a base noise pattern (seed) and iterating a visual transformation (steps) until it achieves something close to the desired image, guided by the prompt (CFG).

In the end, the model of each AI does not include any of the original training images, only the latent space trained from them. This method of training means that any prompt inherits all the biases of the internet’s images and associated metadata. It is still a probability game though–the prompt “engineer” will not always return a white male, but most of the time it will. I learned to be aware of this bias and to specify the age and ethnicity of my characters to maintain consistency, but the ethical implications of this factor cannot be overstated and deserve a paper of their own.

(ii) AI Video

My conception of AI image generation led me to believe that I could input an image and iterate abstract details on top–similar to a filter. My initial plan was to 2D animate the rough morphing of memories, and then run that through AI to achieve a more loose abstract sequence. Unfortunately, the process was tedious, had little flexibility once the animation was set, and gave weird aesthetic results. I wanted to find another option that was smoother and allowed for more control over iteration.

Through online research and talking with a friend in the AI industry, I discovered a video creator extension for Stable Diffusion called Deforum. This became the key to the entire look of the memories. Deforum uses the Stable Diffusion model of image generation with the added ability to morph between prompts over time. Essentially, I could feed the software two different prompts and generate images transitioning between the two concepts.

The element of time added many new variables to control. The most important was the “strength” value, which controlled how much one frame should look like the previous frame. Any tiny shift in the seed of one image would compound through all future images. This meant it was hard to tweak specific images later in the sequence, and I had to rely on generating hundreds of versions of the same prompt sequence and picking which portions to include later.

While finding Deforum was extremely exciting, there remained one huge practical issue: to change the prompt over time, animate camera movement, or adjust the “strength,” one had to type out exact frame numbers for each change. This was tedious and time consuming. To make efficient artistic adjustments, I had to find some way to map out frame values spatially first and then input them into Deforum. Over a few weeks I built out a spreadsheet to assist with this.

Custom “Deforum Timeline” Spreadsheet

The sheet above automatically keyframes strength values. With every new prompt the strength dips, allowing the image to transition quickly. It also lets me see the pacing of each new image and how the style variations change over time. I could relatively easily change the timing of prompts, then copy-paste one giant text block with every prompt at its correct frame value and the automatic strength transitions.
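A simplified Python sketch of what the spreadsheet automates might look like this. The frame numbers, strength values, and the `frame: (value)` output format are illustrative of Deforum-style schedule strings, not my exact settings:

```python
def strength_schedule(prompt_frames, base=0.65, dip=0.35, dip_len=6):
    """Build a Deforum-style "frame: (value)" schedule string that dips the
    strength at each new prompt (so the image can transition quickly),
    then returns to a stable base value a few frames later."""
    keys = []
    for f in prompt_frames:
        keys.append(f"{f}: ({dip})")             # low strength: fast change
        keys.append(f"{f + dip_len}: ({base})")  # back to stable morphing
    return ", ".join(keys)

print(strength_schedule([0, 48, 96]))
# prints "0: (0.35), 6: (0.65), 48: (0.35), 54: (0.65), 96: (0.35), 102: (0.65)"
```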

The final step was to animate the movement of the images. Most Deforum art at the time had a slow continuous zoom in with a slight looping sway to the translation. For telling a narrative story, I thought it would be more apt to have realistic camera movement, as if we are experiencing a point of view. I found an addon in the 3D software Blender which allows the export of 3D camera animation keyframes to a text file compatible with Deforum. I then animated a 3D camera with Blender’s graph editor and input those values directly into Deforum.

Creatively, finding Deforum led me to shift the memory sequence towards a snapshot-like flow of memories. After rigorously experimenting with the strength value, I was able to achieve a morphing effect that rapidly shifted, akin to the thought driven montage I desired initially. Stylistically, I wanted a colorful impressionist painting aesthetic. The color contrasts with the bleakness of the real world which heightens the emotions of the memories. The painted texture brings a level of tactility and weight to the images that I hope helps ground them.

A quick clarification: “AI Animation” is admittedly an incorrect name. “AI video” might be more appropriate, but perhaps the most accurate description would be “AI interpolation”. Deforum does not recognize the motion required for animation or the subject matter required for video, but rather takes a set of text prompts and interpolates between them.

(iii) Entering Actual AI Production

Now that I had the technical process, I could finally start delving into the style of the memories, test which prompts worked best, and build the content of the sequence from there. The sequencing of images was the most difficult part. The emotional core was the father and daughter playing piano and dancing. I wanted it to feel nostalgic, to bleed naturally into other memories, and most importantly to balance abstract thought with clarity of narrative. I developed a four part structure for the memories:

  1. Specific memory (piano and dancing)
  2. Bridge to a series of memories in the same outdoor park (picnic, teddy bear, etc)
  3. Expand to a flurry of childhood moments (birthdays, family dinners, road trips, etc)
  4. Return back to the roots (specific piano memory)

I then delved into the specific style and prompts to use. This involved creating prompt matrices to test the effect of different words. There are four main aspects of the prompt that I honed over a few months: subject, style, variation, and negative. Subject is the main focus of the image (ie: “30 year old father playing piano in cozy room”). Style is the visual representation of the subject that remains consistent (ie: “oil painting of ___, thick strokes of paint”). Variation is a personal term I developed to describe additional details of the image’s subject that might change over the course of the video (ie: “wide angle, 14mm, creepy”). Negative lists the styles or subjects to avoid (ie: “distorted, text, cloned”).

Through these tests I found which words efficiently achieved a defined, impressionistic, and tactile painted result. The more words a prompt contains, the less weight each word carries, so I had to be as concise as possible. I eventually developed this style prompt for the piece:

“oil painting of ___ , thick strokes of paint, cinematic lighting, beautiful composition, faded abstract nostalgic memory, impressionistic”

SUBJECT prompt matrix examples – core prompt: “oil painting of 30 year old man smoking cigar, thick strokes of paint, cinematic lighting, beautiful composition, faded abstract nostalgic memory, impressionistic”

Deciding on subject prompting was the easiest of the four parts. Adding details such as “wearing a fedora” made the image naturally look more old-fashioned and more consistent. I found through these tests that the more specific I could be with clothing, ethnicity, and age, the more cohesive the images remained.

VARIATION prompt matrix examples

The hardest variation effects to achieve were a sense of perspective and an old-fashioned style. I wanted the camera angle to give the point of view of a child: low angles, a wider focal length, et cetera. I suspect that because the images were generated in a painted style, camera-perspective vocabulary lost its effectiveness, as paintings are not usually described in those terms. Over time I found that words describing what one would see while looking up–ceiling, sky–worked much better at achieving this perspective.

I also tested adding filmmakers’ and artists’ names to help achieve a better look. These additions generally improved the quality–primarily the lighting and the definition of the subject. However, in the end I decided not to use any specific artist out of respect for their work, which I discuss further in the ethics portion of this paper.

One main challenge throughout this process was giving detail and specificity to the memories. Because of how the AI is trained, it creates an image that effectively averages the online images matching the prompt. To create an older aesthetic I had to do some research and careful prompting: I researched fashion trends of the 1930s and 1940s and found key reference points such as “shirley temple” and “little rascals.”

 NEGATIVE prompt matrix examples – core prompt: “photo of sheep in field”

For negative prompts, I admittedly could not figure out how to generate a prompt matrix. I was able to find the image above online, posted by the Reddit user SnareEmu. This particular image illuminated two aspects of prompting for me: 1) prompts are probabilistic, not a certainty, and 2) the issue of duplicate people and objects was going to be the hardest thing to avoid.

Combining all four parts (with parentheses, which give greater weight to the enclosed portion of the prompt), the complete prompt with style, subject, variation, and negative for the opening image is:

Oil painting of 30 year old Croatian man playing piano in cozy room, thick strokes of paint, cinematic lighting, beautiful composition, faded abstract nostalgic memory, impressionistic, wide angle distant shot far, neg–cloned, copies, fancy, text, signature

And for the little girl dancing:

Oil painting of little girl dancing joyfully in cozy room vibrant energy, thick strokes of paint, cinematic lighting, beautiful composition, faded abstract nostalgic memory, impressionistic, little rascals, ((yellow middy blouse)), 1940s style, ((center framed, rule of thirds)), neg–cloned, copies, frame, (text), (signature)
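Assembling these full prompts from the four parts can itself be automated. The helper below is a hypothetical sketch: the `weight` and `build_prompt` functions are my own illustrative names, and the parenthesis-emphasis syntax follows common Stable Diffusion front ends rather than any official specification.

```python
def weight(term, level=1):
    """Wrap a term in parentheses; in many Stable Diffusion front ends,
    each pair of parentheses adds emphasis to the enclosed words."""
    return "(" * level + term + ")" * level

def build_prompt(subject, style, variations=(), negatives=()):
    """Assemble (positive, negative) prompt strings from the four parts."""
    positive = style.format(subject)
    if variations:
        positive += ", " + ", ".join(variations)
    return positive, ", ".join(negatives)

# The consistent style prompt developed for the piece.
STYLE = ("oil painting of {}, thick strokes of paint, cinematic lighting, "
         "beautiful composition, faded abstract nostalgic memory, impressionistic")

# Reassembling the "little girl dancing" prompt from the text above.
positive, negative = build_prompt(
    "little girl dancing joyfully in cozy room vibrant energy",
    STYLE,
    variations=["little rascals", weight("yellow middy blouse", 2),
                "1940s style", weight("center framed, rule of thirds", 2)],
    negatives=["cloned", "copies", "frame", weight("text"), weight("signature")],
)
```

Keeping the style string fixed while swapping subjects and variations is what holds the aesthetic consistent across hundreds of generated versions of the sequence.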

After finalizing the sequence of memories and their prompts, I began processing hundreds of versions of the entire sequence. I took these frames and spliced the best portions together to form one long memory sequence.

(iv) Depicting Sam

The more direct use of AI images to depict Alzheimer’s came from the abstract nature of Sam’s character–in particular the third shot, which reveals him. How much detail to show was a difficult decision. Should he have eyes? A mouth? Should just his face be abstract and his body be 3D? While Sam was a projection of myself in this story, I decided to stay true to Hope’s sole perspective by leaving Sam with no definition at all. He almost blends into the background–a mere reflection of Hope’s own tactile face with a faded quality.

Part 5: Other Aspects of Production

In December and January I cast the voice actors through Backstage.com. From more than 350 submissions I picked 20-30 actors to submit a full digital audition. In February, I recorded the audio over Zoom with the actors. I wanted to do this late in the process so I could give the actors visual reference and have more time to workshop the dialogue. The recording process was smooth, albeit awkward to direct purely over Zoom. For Papa’s actor, Rich Greene, we improvised the scene many times together, slowly finding bits of unpredictability in the dialogue, such as him telling his wife to be quiet or lifting child Hope onto the piano chair next to him.

Most of the editing was done early on, in the layout phase of animation. The color and audio correction I passed on to my long-time collaborators, the professional artists Jack Wang and Yicheng Zhu. After several rounds of notes, their incredible talent brought the film into full form. The audio design was one of the most crucial parts of the piece. Yicheng and I talked a lot about how important it was to layer memories reverberating through one another and to emphasize certain moments, such as a young Hope laughing. We took a wholly subjective approach to the audio–for instance, a butterfly’s flapping wings would never realistically be heard, but that image emotionally resonated with Hope, so it was important to include.

The final creative touch to the piece was the credits. I thought this was a unique opportunity to add to the feeling and story. The images of a butterfly and an outdoor field in the background reflect a more hopeful emotion, and the added audio, which closely ties into the memory sequence, allows the audience to relive the memories themselves. In a way, the subliminal images and audio of the credits more accurately cue the audience into the feeling with which Hope is left–one of distant, bittersweet memories.

Part 6: Reflections

(i) How Have I Evolved as an Artist?

This thesis has challenged the core of how I approach filmmaking. Nearly every film I previously made had a story constraint–typically prescribed by a professor–and a more limited timeline. My previous personal projects were driven by the desire to explore a technical challenge–from 2D animation, to a certain lighting design, to VFX. I first approached this project largely through a technical lens: What was feasible? What did I want to explore? On a story level I was motivated by creating a good audience experience. While these aspects felt valid at the time, I learned that I really needed to dig deeper to find a story uniquely my own that I could sustain at such a large scale.

The largest source of anxiety in this process was whether the film would be feasible. Any animated project requires a degree of learning on the job, and the process of AI image generation also demanded flexibility with the tools I used. Throughout the months of production there were several major advancements in AI image generation–ControlNet, Stable Diffusion v2.0, and a plethora of AI video generators–any of which could have dramatically affected how I approached the memory sequence. I learned that it was important to stay up to date; however, I could not let that distract me from the creative core, or I would not have been able to execute the entire project.

The creative process of making an animated film meant I was mostly working alone at my computer. To receive any help or feedback I needed to reach out to creative collaborators. This dynamic challenged my confidence as a director. When is the right moment to ask for feedback? What context should an artist give to someone offering notes? Initially in our thesis class I presented my script and said: “any notes are welcome.” That framing led the conversation into rabbit holes of technical details and visual clarifications, until I realized that what I really wanted was an emotional conversation: Does the film work on a story level? So for future feedback sessions I opened with that question, but it meant I would receive ten different interpretations of what the story should be. For months I got lost thinking more about other people’s conflicting opinions than about my own. I learned to find a balance, and to especially appreciate those creative collaborators who first asked questions about my intentions.

(ii) Ethics of AI

AI image generation, and AI in general, carries huge cultural implications, both technological and ethical. Is it okay to benefit from other artists’ years of work for my own purposes, over which they have no say? While I did not end up using “directed by Terrence Malick” as a prompt, if I had, should I have credited him? Even if I do not use an artist’s name in a prompt, the AI is still trained on the same artwork–so is it not inherently a form of stealing?

I do not think the use of AI image generation is inherently stealing, as long as it is used transformatively and proper credit is given. The legal question of copyright infringement is up to the courts, but these ethical questions must be decided at the social level. When I initially started creating AI images, I relied heavily on artist names to create a strong aesthetic design. Over time I found this less necessary as I learned to prompt better, yet I still cannot deny the obvious quality difference when using an artist’s name in a prompt. Eventually I opted out of using artists’ names altogether. If I do find it necessary to include an artist’s name in a future project, I will ensure they are properly credited.

(iii) How I Contributed to the Medium of Film

On a creative level, Remembrance is one of the first films to depict this moment of lucidity for patients with Alzheimer’s. Technically, the film breaks ground by using AI image generation for a story-driven purpose, opening the door to new ways of portraying thought and memory in art. The animation’s overall visual aesthetic–painterly style, rough pencil lines, shifting scenes, naturalistic camera movement–adds a tactility that pushes against the trend toward realism in animation.

(iv) Did I Characterize Alzheimer’s Properly?

There remains a lingering question in the back of my mind about how accurate Remembrance’s depiction of Alzheimer’s/dementia is. Does the bleakness of the real world come across as stereotypically sad? Does the film imply that music therapy is the only solace for those afflicted with Alzheimer’s? As the creator, it is not my place to answer these questions, though I hope the audience answers with a “no.” My intention was to portray honestly my own experience of attempting to elicit that same response from my Grandmother, and I hope I captured the bittersweet moments of lucidity accurately. In the end, I can only speak from my own feelings and research; however, I hope the film reflects an honest experience and broadens the audience’s perspective on Alzheimer’s.
