Published February 3, 2025 12 min read

Apple’s ARKit vs. Eye Fatigue

by Yurij Gapon

Head of iOS @ It-Jim

In today’s world, digital devices dominate our daily lives, and we spend significant time in front of screens: computers, smartphones, tablets, and so on. While this is an inevitable part of modern life, it also places substantial strain on our eyes. For many, eye fatigue has become routine, and if ignored, it can lead to serious health issues. Key symptoms include poor sleep, light sensitivity, and reduced productivity.

When these symptoms appear, the first step should be to reduce screen time. However, this is not always possible. Another option is eye exercises, and an application that guides a person through a set of them would be beneficial. That’s what we’re going to create today.

Eye Tracking

The key feature of an eye training app is eye tracking. Tracking eye movement lets the app assess whether an exercise was completed and give the user appropriate feedback.

To implement the eye tracking function, we compared several potential solutions:

Criterion	Vision	ML Kit	ARKit
Processing Time*	±7.3 ms	±14.25 ms	±8.6 ms
Output Data Type	2D	2D	3D
Individual Pupil Tracking	–	–	+
Setup Code	Small	Large	Small
Guides and Tutorials	Many	Few	Many
Cross-platform	–	+	–

* Median, measured at 1080p and 60 fps on the iPhone 14 Pro front camera.

Vision Framework: Provides extensive capabilities for 2D face tracking and keypoint detection, such as eye tracking. However, its accuracy and functionality when working with pupils are limited compared to ARKit.
Google ML Kit: A cross-platform solution with basic face and eye area tracking capabilities. The main drawbacks include slower frame processing on iOS compared to native tools and challenges in working with pupil tracking.
ARKit (ARFaceTracking): An Apple platform offering powerful tools for eye tracking in a 3D space. ARKit delivers precise data through the use of the TrueDepth camera and provides the best native implementation for pupil tracking.

Currently, there is no requirement for cross-platform implementation, as our focus is solely on iOS, where frame processing speed is critical. Additionally, ARKit’s output in a 3D format offers a more advanced implementation, providing deeper visualization options, better customization, and a more comprehensive picture of user actions.

Based on the above considerations, we have chosen ARKit (ARFaceTracking) to implement the eye tracking service.

First, we will define the ARSessionManager protocol and data models for processing results.

We will create the EyeTrackingData model to store data about the position of each eye in all expected states, enabling us to process the results from ARFaceAnchor and retain them:

final class EyeTrackingData {
    // MARK: - Properties
    var eyeLookInLeft: Float
    var eyeLookOutLeft: Float
    var eyeLookInRight: Float
    var eyeLookOutRight: Float
    var eyeLookUpLeft: Float
    var eyeLookDownLeft: Float
    var eyeLookUpRight: Float
    var eyeLookDownRight: Float
    var eyeBlinkLeft: Float
    var eyeBlinkRight: Float
    var eyeWideLeft: Float
    var eyeWideRight: Float
    
    // MARK: - Init
    init(...) { ... }
}

Now let’s describe the ARSessionManager protocol and the ARSessionManagerDelegate delegate, which will return the results for further use:

protocol ARSessionManager: AnyObject {
    // MARK: - Funcs
    func setDelegate(_ delegate: ARSessionManagerDelegate)
    func setupSession() -> ARSCNView
    func startSession()
    func pauseSession()
}

protocol ARSessionManagerDelegate: AnyObject {
    func didUpdateEyeTrackingData(_ data: EyeTrackingData)
}

When implementing ARSessionManager, it is important to consider the following configurations:

Using arSessionQueue to isolate the service’s operation queue from the UI, preventing interface blocking;
Using ARFaceTrackingConfiguration to explicitly specify the type of tracking.


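import ARKit

final class ARSessionManagerImpl: NSObject, ARSessionManager {
    // MARK: - Delegate
    // Held weakly so the manager and its delegate (e.g. the Presenter) do not retain each other.
    private weak var delegate: ARSessionManagerDelegate?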
    
    // MARK: - Properties
    private var configurations: ARConfiguration?
    private let arSessionQueue = DispatchQueue(
        label: "ar-session-queue",
        qos: .userInitiated,
        attributes: [],
        autoreleaseFrequency: .workItem
    )
    
    // MARK: - ARSceneView
    private var sceneARView = ARSCNView()
    
    // MARK: - Set
    func setDelegate(_ delegate: ARSessionManagerDelegate) {
        self.delegate = delegate
    }
    
    func setupSession() -> ARSCNView {
        configurations = ARFaceTrackingConfiguration()
        sceneARView.delegate = self
        return sceneARView
    }
}

The methods startSession() and pauseSession() are provided for session management:

// MARK: - Controls
extension ARSessionManagerImpl {
    func startSession() {
        arSessionQueue.async {
            guard let config = self.configurations else { return }
            self.sceneARView.session.run(config, options: [
                .resetTracking, .removeExistingAnchors
            ])
        }
    }
    
    func pauseSession() {
        arSessionQueue.async {
            self.sceneARView.session.pause()
        }
    }
}

To accomplish the primary function (tracking the user’s eye state and passing the data on), we implement the renderer(_:didUpdate:for:) method of ARSCNViewDelegate. It is called whenever the node of an anchor is updated, which lets us retrieve the ARFaceAnchor and its associated data set.

One of the key components returned by ARFaceAnchor is blendShapes. These are a set of parameters that describe specific facial positions and states, such as blinking, eye movements, or changes in mouth shape. Each of these positions is represented as a numeric value ranging from 0.0 to 1.0, indicating the intensity of a particular action or position.

BlendShapes are crucial for accurately determining the user’s eye state. For instance, the parameters eyeBlinkLeft and eyeBlinkRight indicate the blinking level of the left and right eyes, while eyeLookUpLeft or eyeLookOutRight show the gaze direction. Apple provides visualizations and documentation for these parameters, which greatly simplifies their integration into application development.
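To make the role of the eight gaze coefficients more concrete, here is a minimal sketch that folds them into one signed gaze estimate. The types GazeCoefficients and GazeVector and the function gazeVector(from:) are hypothetical names introduced for illustration, not part of ARKit or of the app described here. The sketch assumes that "left" and "right" refer to the user's own eyes, and that an inward turn of the left eye together with an outward turn of the right eye means a rightward gaze.

```swift
/// Hypothetical plain-value snapshot of the eight gaze coefficients (each 0.0...1.0).
struct GazeCoefficients {
    var lookInLeft: Float = 0
    var lookOutLeft: Float = 0
    var lookInRight: Float = 0
    var lookOutRight: Float = 0
    var lookUpLeft: Float = 0
    var lookDownLeft: Float = 0
    var lookUpRight: Float = 0
    var lookDownRight: Float = 0
}

/// Signed gaze estimate: x > 0 is toward the user's right, y > 0 is up.
struct GazeVector {
    var x: Float
    var y: Float
}

/// Averages both eyes and subtracts opposing directions (an illustrative simplification).
func gazeVector(from c: GazeCoefficients) -> GazeVector {
    // Left eye turning toward the nose and right eye turning outward both indicate "right".
    let right = (c.lookInLeft + c.lookOutRight) / 2
    let left = (c.lookOutLeft + c.lookInRight) / 2
    let up = (c.lookUpLeft + c.lookUpRight) / 2
    let down = (c.lookDownLeft + c.lookDownRight) / 2
    return GazeVector(x: right - left, y: up - down)
}
```

Averaging the two eyes hides per-eye differences, so an app that needs to check each eye separately would keep the raw coefficients instead.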

Eye Blinking

// MARK: - ARSCNViewDelegate
extension ARSessionManagerImpl: ARSCNViewDelegate {
    func renderer(
        _ renderer: SCNSceneRenderer,
        didUpdate node: SCNNode,
        for anchor: ARAnchor
    ) {
        guard let faceAnchor = anchor as? ARFaceAnchor else { return }
        let blendShapes = faceAnchor.blendShapes
        
        // Each lookup yields an optional NSNumber; a missing coefficient falls back to 0
        // so the values match the non-optional Float properties of EyeTrackingData.
        let eyeTrackingData = EyeTrackingData(
            eyeLookInLeft: blendShapes[.eyeLookInLeft]?.floatValue ?? 0,
            eyeLookOutLeft: blendShapes[.eyeLookOutLeft]?.floatValue ?? 0,
            eyeLookInRight: blendShapes[.eyeLookInRight]?.floatValue ?? 0,
            eyeLookOutRight: blendShapes[.eyeLookOutRight]?.floatValue ?? 0,
            eyeLookUpLeft: blendShapes[.eyeLookUpLeft]?.floatValue ?? 0,
            eyeLookDownLeft: blendShapes[.eyeLookDownLeft]?.floatValue ?? 0,
            eyeLookUpRight: blendShapes[.eyeLookUpRight]?.floatValue ?? 0,
            eyeLookDownRight: blendShapes[.eyeLookDownRight]?.floatValue ?? 0,
            eyeBlinkLeft: blendShapes[.eyeBlinkLeft]?.floatValue ?? 0,
            eyeBlinkRight: blendShapes[.eyeBlinkRight]?.floatValue ?? 0,
            eyeWideLeft: blendShapes[.eyeWideLeft]?.floatValue ?? 0,
            eyeWideRight: blendShapes[.eyeWideRight]?.floatValue ?? 0
        )
        )
        
        delegate?.didUpdateEyeTrackingData(eyeTrackingData)
    }
}

We have created the EyeTrackingData model and defined the complete logic for ARSessionManager, which works with ARFaceTrackingConfiguration and provides the expected data. Now, we will focus on implementing the service that will process the results and determine whether the selected exercises have been completed.

To begin, it is necessary to create appropriate working models to describe the exercises and the criteria for their completion, such as eye positions. In our case, exercises will define the direction of the gaze relative to the center, meaning that the exercise name and the eye position can match:

enum EyeExercise: CaseIterable {
    case right
    case left
    case up
    case down
    case topLeft
    case topRight
    case bottomLeft
    case bottomRight
    case blink
}

Next, we need to define the criteria for the ExerciseService, i.e., its protocol. In our case, it will have combined functionality, meaning it will both create the training sequence and verify whether the current exercise is completed, then switch to the next one.


protocol ExerciseService {
    func regenerateExercises(type: TrainingSetType)
    func isCurrentExerciseCompleted(
        inputData: EyeTrackingData,
        user: UserData?
    ) -> Bool
}

The implementation of the isCurrentExerciseCompleted() method is critical to the functionality of our app, as this method determines whether the current exercise has been successfully completed:

func isCurrentExerciseCompleted(
    inputData: EyeTrackingData,
    user: UserData?
) -> Bool {
    /// We’ll check the input data value of each eye separately and determine
    /// its position to make sure that the exercise is being completed.
    /// For blinks, we will check whether the eyes were closed
    /// (i.e., no pupils are visible)
}
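The body of the method is not shown above, so the following is only a sketch of how such a check might look, not the app's actual implementation. It assumes the gaze has already been reduced to signed x/y values plus two blink coefficients. The GazeSample type, the isCompleted(_:sample:threshold:) function, and the fixed threshold of 0.5 are all hypothetical; a real version would compare each eye separately against the calibrated per-user limits.

```swift
/// Exercise set, mirroring the EyeExercise enum from the article.
enum EyeExercise: CaseIterable {
    case right, left, up, down
    case topLeft, topRight, bottomLeft, bottomRight
    case blink
}

/// Hypothetical reduced input: x > 0 is right, y > 0 is up, blink values are 0.0...1.0.
struct GazeSample {
    var x: Float = 0
    var y: Float = 0
    var blinkLeft: Float = 0
    var blinkRight: Float = 0
}

/// Returns true when the sample satisfies the given exercise.
/// The default threshold is an illustrative placeholder, not a calibrated value.
func isCompleted(_ exercise: EyeExercise, sample s: GazeSample, threshold t: Float = 0.5) -> Bool {
    switch exercise {
    case .right:       return s.x >= t
    case .left:        return s.x <= -t
    case .up:          return s.y >= t
    case .down:        return s.y <= -t
    case .topLeft:     return s.x <= -t && s.y >= t
    case .topRight:    return s.x >= t && s.y >= t
    case .bottomLeft:  return s.x <= -t && s.y <= -t
    case .bottomRight: return s.x >= t && s.y <= -t
    case .blink:       return s.blinkLeft >= t && s.blinkRight >= t   // both eyes closed
    }
}
```

A single threshold for all directions keeps the sketch short; in practice the reachable range usually differs between directions, which is what the calibration step is meant to capture.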

In our specific case, we employ the MVP architectural pattern, where data from ARSessionManager is returned via a delegate to the Presenter. In the Presenter, the data is processed using the ExerciseService class, which is responsible for structuring the training sequence and verifying the completion of the current exercise. These results are then processed to provide the user with appropriate feedback.
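The data flow described above can be sketched roughly as follows. To keep the example self-contained, it uses reduced stand-ins: EyeSample, ExerciseChecking, TrainingView, and TrainingPresenter are hypothetical names, not the app's real types, and the payload is a single value instead of the full EyeTrackingData.

```swift
/// Hypothetical simplified payload standing in for EyeTrackingData.
struct EyeSample {
    var gazeX: Float
}

/// Hypothetical slice of ExerciseService: only the completion check.
protocol ExerciseChecking {
    func isCurrentExerciseCompleted(inputData: EyeSample) -> Bool
}

/// Hypothetical view interface the Presenter talks to.
protocol TrainingView: AnyObject {
    func showExerciseCompleted()
}

final class TrainingPresenter {
    private let exercises: ExerciseChecking
    private weak var view: TrainingView?   // weak: the view owns the presenter, not vice versa

    init(exercises: ExerciseChecking, view: TrainingView) {
        self.exercises = exercises
        self.view = view
    }

    /// In the real app this would be called from
    /// ARSessionManagerDelegate.didUpdateEyeTrackingData(_:).
    func didUpdate(_ sample: EyeSample) {
        guard exercises.isCurrentExerciseCompleted(inputData: sample) else { return }
        view?.showExerciseCompleted()
    }
}
```

Because ARSCNViewDelegate callbacks arrive on SceneKit's rendering thread, a production Presenter would hop to the main queue before touching the UI; the sketch stays synchronous to keep the flow easy to follow.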

Calibration: A Crucial Step

Before a user begins using the app regularly, it is critical to perform a calibration process. Each individual is unique, with different eye positions, varying limits on rotation and movement, varying eye depth in the skull, and other physiological differences.

To ensure the comprehensive and high-quality functionality of our app, we must include a dedicated calibration feature. This involves creating a specific training sequence — a set of exercises that accounts for a maximum number of positions and states.

Additionally, an informational Best Practices screen should be implemented to educate and guide the user effectively.

At the end of the calibration (as with every workout), it’s worth adding a rewards screen to highlight the end of the workout and give the user a sense of accomplishment.

Best Calibration Tips

To achieve this, we will proceed with the following steps:

Perform two cycles of EyeExercise with a pause of 5-10 seconds between each exercise. This will allow us to determine typical eye deviations and their positions for each exercise.
Save these results in the corresponding values of UserData, scaled by a coefficient of 0.8. The reduction leaves headroom for the natural imperfections in human movements and the variability of results.
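As a rough illustration of the two steps above, the sketch below turns the peak values recorded for one exercise into a stored threshold. The function name calibratedThreshold(peaks:coefficient:) is hypothetical, and averaging the two cycles before scaling is an assumption; the article only states that the results are saved with a coefficient of 0.8.

```swift
/// Combines the peak coefficient recorded in each calibration cycle into one threshold.
/// - Parameters:
///   - peaks: highest value observed for the exercise in each cycle (assumed 0.0...1.0)
///   - coefficient: safety factor applied to the mean; 0.8 as described in the text
/// - Returns: the scaled threshold, or nil when no cycle produced a value.
func calibratedThreshold(peaks: [Float], coefficient: Float = 0.8) -> Float? {
    guard !peaks.isEmpty else { return nil }
    let mean = peaks.reduce(0, +) / Float(peaks.count)
    return mean * coefficient
}
```

For example, peaks of 0.9 and 0.7 average to 0.8, so the stored threshold would be about 0.64: the user then only needs to reach roughly 80% of their calibrated range for an exercise to count.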

Eye Tracking 1

Eye Tracking 2

After this, the user is guided through a set of exercises in which they move their eyes in all directions.

More About the Application

Data Input Form and Its Purpose

For personalized user interaction and efficient data storage and management, we utilize Apple’s CoreData framework. This allows for seamless operation with a local database and offers flexibility in handling data.

We create a UserData model to store essential user information, along with child entities to manage and track exercises (see the relationship diagram below):

Data Input Schemes

During the initial setup (onboarding), the user is prompted to enter the following information:

Working hours: Start time and duration of the workday spent at the computer;
Working days: The days of the week when the user is actively working.

This data is essential for personalizing notifications to align with the user’s work schedule and ensure they are not intrusive during non-working hours.
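One way to use these onboarding answers is a simple check of whether a given moment falls inside the user's working time. The sketch below is an assumption about how that could look: WorkSchedule and isWorkingTime(_:schedule:calendar:) are hypothetical names, and the real UserData entity may store these values differently.

```swift
import Foundation

/// Hypothetical value type mirroring the onboarding answers.
struct WorkSchedule {
    var startHour: Int        // hour of day (0...23) when the workday begins
    var durationHours: Int    // length of the workday in hours
    var workDays: Set<Int>    // Calendar weekday numbers: 1 = Sunday ... 7 = Saturday
}

/// Returns true when `date` falls on a working day and within working hours.
/// Workdays that run past midnight are not handled in this sketch.
func isWorkingTime(_ date: Date, schedule: WorkSchedule, calendar: Calendar = .current) -> Bool {
    let parts = calendar.dateComponents([.weekday, .hour], from: date)
    guard let weekday = parts.weekday, let hour = parts.hour else { return false }
    guard schedule.workDays.contains(weekday) else { return false }
    return hour >= schedule.startHour && hour < schedule.startHour + schedule.durationHours
}
```

Passing the calendar as a parameter keeps the check testable with a fixed time zone, while the default uses the user's current calendar.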

Notifications

Regular breaks and exercises are important, so scheduled reminders throughout the day are a must.

To handle notification creation and management, we first define a protocol NotificationService, where we outline the required functionality:

protocol NotificationService: AnyObject {
    func scheduleNotifications(user: UserData, timeReminder: Int)
    func rescheduleNotifications(user: UserData)
}

Next, we will implement the methods scheduleNotifications() and rescheduleNotifications(), which will handle creating notifications based on the user’s onboarding questionnaire and updating them if the user completes eye exercises between reminders.

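func scheduleNotifications(
    user: UserData,
    timeReminder: Int   /// number of hours between notifications
) {
    /// `lastWorkout` and `workDays` are not defined in this snippet; they are
    /// assumed to be derived from `user` elsewhere in the service.
    guard timeReminder > 0 else { return }   /// stride(by:) traps on a zero step
    let workingHours = Int(user.workingTime)
    let startHour = Calendar.current.component(.hour, from: lastWorkout)
    UNUserNotificationCenter.current().removeAllPendingNotificationRequests()
    
    /// One reminder every `timeReminder` hours, strictly inside the workday
    for day in workDays {
        for hour in stride(
            from: startHour + timeReminder,
            to: startHour + workingHours,
            by: timeReminder
        ) {
            addNotification(day: day, hour: hour, lastWorkout: lastWorkout)
        }
    }
}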

A private method addNotification() has been added to create a request. This method provides the context and trigger for the notification and adds it to the general notification pool.

private func addNotification(day: Int, hour: Int, lastWorkout: Date) {
    var dateComponents = DateComponents()
    dateComponents.weekday = day
    dateComponents.hour = hour
    
    if let notificationDate = Calendar.current.nextDate(
        after: lastWorkout,
        matching: dateComponents,
        matchingPolicy: .nextTime
    ) {
        /// Set notification content
        let content = UNMutableNotificationContent()
        content.title = Strings.NotificationService.title
        content.body = Strings.NotificationService.body
        
        /// Set notification trigger
        let trigger = UNCalendarNotificationTrigger(
            dateMatching: Calendar.current.dateComponents(
                [.year, .month, .day, .hour, .minute, .second],
                from: notificationDate
            ),
            repeats: false
        )
        
        let request = UNNotificationRequest(
            identifier: UUID().uuidString,
            content: content,
            trigger: trigger
        )
        
        UNUserNotificationCenter.current().add(request) { (error) in
            if let error = error {
                /// handling the error
            }
        }
    }
}

The implementation of rescheduleNotifications() remains similar, with the consideration that current notifications will be recreated for the remainder of the workday.

For example, if a user works from 9:00 AM to 5:00 PM with a reminder interval of every 2 hours, notifications will be sent at 11:00 AM, 1:00 PM, and 3:00 PM. Notifications will not be sent during non-working hours or days, ensuring they are non-intrusive and aligned with the user’s personal schedule.
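The arithmetic behind this example can be checked in isolation. The helper below, reminderHours(startHour:workingHours:interval:), is a hypothetical extraction of the stride used in scheduleNotifications(); the filter that drops hours past midnight is an added assumption, not something the original method does.

```swift
/// Returns the hours (24-hour clock) at which reminders would fire during one workday.
/// The first reminder comes one interval after the start; the end of the workday is excluded.
func reminderHours(startHour: Int, workingHours: Int, interval: Int) -> [Int] {
    // stride(from:to:by:) traps on a zero step, so reject invalid input first.
    guard interval > 0, workingHours > 0 else { return [] }
    return stride(from: startHour + interval, to: startHour + workingHours, by: interval)
        .filter { $0 < 24 }   // assumption: ignore hours that would spill into the next day
}
```

Calling reminderHours(startHour: 9, workingHours: 8, interval: 2) yields [11, 13, 15], matching the 11:00 AM, 1:00 PM, and 3:00 PM reminders described above. No reminder is produced at 5:00 PM because the upper bound of the stride is exclusive.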

Colors

Last but not least is the UI color scheme. User interface design and user experience are critical for eye health applications, as the right color scheme can reduce eye strain and enhance user perception (DevTo). UI colors for the app were chosen based on the principles of color psychology and their impact on users (MockFlow, HappyDesign).

Eye Tracking App Colors

Conclusion

In today’s world, digital devices dominate our lives, yet we often overlook the long-term impact of prolonged screen time on our eyes. Symptoms like migraines, disrupted sleep, light sensitivity, and reduced productivity may begin subtly but can escalate into significant health issues. Apps like ours aim to address these challenges proactively, promoting better eye health and well-being.

Building an app to combat eye fatigue requires more than technical expertise; it demands thoughtful design. Eye-tracking technology must balance performance, accuracy, and platform compatibility for seamless integration. Equally vital is the user experience – interfaces should reduce eye strain with adaptive color schemes and feel intuitive to use. Notifications play a key role in encouraging regular breaks, fostering healthier habits.

Challenges remain, such as hardware limitations (e.g., TrueDepth camera availability) and the need for robust onboarding and calibration processes to personalize the experience. User education is also critical, ensuring awareness of the importance of eye care and exercises.

Our app leverages ARKit with ARFaceTracking for precise, efficient three-dimensional eye tracking. The ARSessionManager isolates session handling, ensuring smooth data flow to the Presenter, where exercises are monitored. Adaptive color schemes reduce strain, while smart notifications remind users to take breaks, tailored to their schedules.

This demonstrates how technology can address real-world health issues. However, opportunities abound – whether through integrating third-party platforms or enhancing functionality with machine learning for greater precision and personalization.

How would you implement eye tracking in your app?

Perhaps it’s time to explore the possibilities that machine learning could bring to the table. After all, the future of eye tracking is only limited by the scope of our imagination.
