SkinScan: what it actually takes to ship an AI-powered iOS app
Shipping an AI product to real users is harder than training the model. Here is everything I ran into building SkinScan, from camera capture to HIPAA paper trails.
SkinScan is an iOS app that uses a computer vision model to analyze skin conditions from photos. It is in TestFlight with a closed beta group right now. And building it has been one of the more humbling experiences I have had as a developer.
Training the model is the easy part. Shipping it to real people on real phones is where things get complicated.
the stack
SwiftUI for the UI. AVFoundation for camera capture. Go on the backend, because in my experience image processing pipelines in Python are slow and memory-hungry, and Go handles concurrent uploads cleanly with a much smaller memory footprint. The AI model is a custom segmentation model trained on roughly 12,000 labeled images, running on a rented A100.
Storage is encrypted at rest. Images auto-delete after 30 days. HIPAA compliance is not optional when you are dealing with anything that could be considered health data.
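The 30-day auto-delete rule is simple enough to sketch. This is an illustration under assumptions, not the production job: imageRecord and expiredIDs are hypothetical names, and a real purge would run against the actual storage layer on a schedule.

```go
package main

import (
	"sort"
	"time"
)

// retention is the 30-day window after which images are deleted.
const retention = 30 * 24 * time.Hour

// imageRecord is a hypothetical metadata row for one stored image.
type imageRecord struct {
	ID         string
	UploadedAt time.Time
}

// isExpired reports whether the record has reached the retention limit.
func isExpired(rec imageRecord, now time.Time) bool {
	return now.Sub(rec.UploadedAt) >= retention
}

// expiredIDs returns the IDs a purge job should delete, oldest first.
func expiredIDs(recs []imageRecord, now time.Time) []string {
	var expired []imageRecord
	for _, rec := range recs {
		if isExpired(rec, now) {
			expired = append(expired, rec)
		}
	}
	sort.Slice(expired, func(i, j int) bool {
		return expired[i].UploadedAt.Before(expired[j].UploadedAt)
	})
	ids := make([]string, 0, len(expired))
	for _, rec := range expired {
		ids = append(ids, rec.ID)
	}
	return ids
}
```

Passing now in as a parameter, rather than calling time.Now() inside, keeps the rule easy to test and makes each purge run reproducible from its logged timestamp.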
the camera integration
AVFoundation works well with SwiftUI once you know what you are doing. AVCapturePhotoOutput with depth data delivery enabled gives you high-quality captures the model can work with. The tricky part is the capture pipeline: you want to guide the user to hold the phone at the right distance and angle, find good lighting, and stay still long enough to get a sharp image.
We went through three different UI approaches for this. First was a simple circle overlay. Second was a live quality score in the corner. Third, which we ship now, is a combination: a framing guide plus a confidence indicator that only lets the user capture when the image quality is high enough. The forced quality gate annoyed some users but cut model error rates significantly.
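The post does not say how the quality score is computed, so the following is only a sketch of one common approach: variance of the Laplacian as a sharpness measure, plus a brightness range check. It is written in Go purely to show the logic (the real gate would live in the Swift capture code), and gateThresholds and every threshold value are hypothetical.

```go
package main

// laplacianVariance estimates sharpness of a rectangular grayscale image
// with pixel values in [0, 1]. Blurry frames have weak edges, so the
// Laplacian response varies little and the variance is low.
func laplacianVariance(img [][]float64) float64 {
	h := len(img)
	if h < 3 || len(img[0]) < 3 {
		return 0
	}
	w := len(img[0])
	var sum, sumSq float64
	n := 0
	for y := 1; y < h-1; y++ {
		for x := 1; x < w-1; x++ {
			// 4-neighbour Laplacian kernel.
			v := img[y-1][x] + img[y+1][x] + img[y][x-1] + img[y][x+1] - 4*img[y][x]
			sum += v
			sumSq += v * v
			n++
		}
	}
	mean := sum / float64(n)
	return sumSq/float64(n) - mean*mean
}

// meanBrightness returns the average pixel value, or 0 for an empty image.
func meanBrightness(img [][]float64) float64 {
	var sum float64
	n := 0
	for _, row := range img {
		for _, v := range row {
			sum += v
			n++
		}
	}
	if n == 0 {
		return 0
	}
	return sum / float64(n)
}

// gateThresholds holds hypothetical tuning values for the capture gate.
type gateThresholds struct {
	MinSharpness  float64
	MinBrightness float64
	MaxBrightness float64
}

// canCapture reports whether the frame is sharp enough and neither too
// dark nor too bright, i.e. whether the shutter button should be enabled.
func canCapture(img [][]float64, th gateThresholds) bool {
	b := meanBrightness(img)
	return laplacianVariance(img) >= th.MinSharpness &&
		b >= th.MinBrightness && b <= th.MaxBrightness
}
```

A perfectly flat frame has a Laplacian variance of zero and is rejected, while a frame with strong edges passes as long as its brightness is in range. Real thresholds would have to be tuned against captures the model actually gets right and wrong.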
the model
The model scores 89% mAP on our validation set. That sounds good until you think about what the misses look like in a health-adjacent product. (mAP averages precision across recall levels and classes, so the remaining 11 points are not literally an 11% error rate, but they still stand for real missed or wrong findings.) Skin lesion classification is genuinely hard. Lighting variation is massive. Skin tone diversity matters a lot for model performance. And benign vs malignant can look almost identical to a computer vision model trained on a limited dataset.
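Because mAP is easy to misread as accuracy, here is a minimal sketch of how average precision is computed for one class and then averaged. It uses the plain non-interpolated definition; the post does not say which variant or IoU threshold the 89% figure uses, and real detection or segmentation benchmarks add an overlap test to decide what counts as a hit. The function names are illustrative.

```go
package main

// averagePrecision computes non-interpolated AP for one class.
// hits must be ordered by descending model confidence; hits[i] is true
// when prediction i matches a ground-truth item. totalPositives is the
// number of ground-truth items, including any the model never found.
func averagePrecision(hits []bool, totalPositives int) float64 {
	if totalPositives == 0 {
		return 0
	}
	truePositives := 0
	sum := 0.0
	for i, hit := range hits {
		if hit {
			truePositives++
			// Precision at the rank where this true positive appears.
			sum += float64(truePositives) / float64(i+1)
		}
	}
	return sum / float64(totalPositives)
}

// meanAveragePrecision averages per-class AP values into mAP.
func meanAveragePrecision(perClass []float64) float64 {
	if len(perClass) == 0 {
		return 0
	}
	sum := 0.0
	for _, ap := range perClass {
		sum += ap
	}
	return sum / float64(len(perClass))
}
```

For the ranked results hit, miss, hit with two ground-truth items, the precisions at the two hits are 1/1 and 2/3, so AP is their mean, roughly 0.83. The score drops both when confident predictions are wrong and when true items are never found, which is why it does not map onto a single failure percentage.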
We are extremely clear in the app that this is not a diagnostic tool. It is a screening aid. Every result screen has this language prominent, not buried in a terms-of-service footer. The legal review made that non-negotiable, but I think it is the right call anyway.
the pipeline latency problem
From camera tap to result on screen takes about 12 seconds right now. That is too slow. The model inference itself is only about 300ms. The bottleneck is image preprocessing and network transfer.
The fix I am working on is moving preprocessing to the device. Core ML can run the preprocessing steps natively, and the compressed feature representation is much smaller than the raw image. This should cut end-to-end latency to somewhere around 3 to 4 seconds, which feels acceptable for a medical-adjacent use case where you are not expecting instant results.
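The latency budget is easy to reason about with a little arithmetic. The sketch below uses the two figures from this section (12 seconds end to end, about 300 ms of inference); the payload sizes and uplink speed in the usage note are made-up assumptions, since the post does not give them.

```go
package main

import "time"

// Figures quoted in this section.
const (
	endToEnd  = 12 * time.Second
	inference = 300 * time.Millisecond
)

// overhead is everything that is not model inference: preprocessing,
// network transfer, and anything else on the request path.
func overhead(total, infer time.Duration) time.Duration {
	return total - infer
}

// transferTime estimates how long a payload takes to upload at a given
// uplink speed. It ignores connection setup, TLS handshakes and round
// trips, so treat the result as a lower bound.
func transferTime(payloadBytes int64, uplinkBitsPerSec float64) time.Duration {
	seconds := float64(payloadBytes*8) / uplinkBitsPerSec
	return time.Duration(seconds * float64(time.Second))
}
```

overhead(endToEnd, inference) is 11.7 seconds, so inference is under 3% of the wait. With a hypothetical 5 MB photo on a 10 Mbit/s uplink, transferTime(5000000, 10000000) is 4 seconds on its own, while a hypothetical 200 KB feature payload takes about 160 ms. That is the kind of gap that makes on-device preprocessing worth the effort.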
HIPAA is a paper trail problem, not a technical one
The technical side of HIPAA is fine. Encrypt everything (AES-256 at rest, TLS 1.3 in transit). Audit logs. Role-based access. Automatic deletion. None of that is hard to implement if you plan for it from the start.
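For the at-rest piece, here is a minimal sketch of AES-256 encryption using Go's standard library. The post only says AES-256; the choice of GCM mode, the function names, and the nonce-prefixed layout are my assumptions, and key storage and rotation (usually a managed key service) are out of scope here.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"errors"
)

// newGCM builds an AES-256-GCM cipher and rejects keys of the wrong size.
func newGCM(key []byte) (cipher.AEAD, error) {
	if len(key) != 32 {
		return nil, errors.New("key must be 32 bytes for AES-256")
	}
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	return cipher.NewGCM(block)
}

// encryptImage returns nonce || ciphertext || auth tag.
// A fresh random nonce is generated for every call.
func encryptImage(key, plaintext []byte) ([]byte, error) {
	gcm, err := newGCM(key)
	if err != nil {
		return nil, err
	}
	nonce := make([]byte, gcm.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		return nil, err
	}
	// Seal appends the encrypted data to its first argument, so the
	// nonce ends up as a prefix of the stored blob.
	return gcm.Seal(nonce, nonce, plaintext, nil), nil
}

// decryptImage reverses encryptImage and fails if the blob was modified.
func decryptImage(key, sealed []byte) ([]byte, error) {
	gcm, err := newGCM(key)
	if err != nil {
		return nil, err
	}
	if len(sealed) < gcm.NonceSize() {
		return nil, errors.New("ciphertext too short")
	}
	nonce, ciphertext := sealed[:gcm.NonceSize()], sealed[gcm.NonceSize():]
	return gcm.Open(nil, nonce, ciphertext, nil)
}
```

GCM is an authenticated mode, so a stored image that has been tampered with fails to decrypt instead of silently producing garbage. That property is useful when you later have to explain in writing how integrity is protected.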
The documentation side is painful. Every design decision needs a written justification. Every data access needs a logged reason. Every bug needs a postmortem with a root cause and remediation plan. This is not bad practice; it is just slower than I expected. Budget twice as long as you think for anything touching healthcare data.
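One way to make "every data access needs a logged reason" hard to skip is to put the reason into the access path itself. The sketch below is an assumption about how that could look, not the SkinScan code: auditLog, accessEvent and readImage are hypothetical, and a real log would go to append-only storage rather than a slice in memory.

```go
package main

import (
	"errors"
	"strings"
	"sync"
	"time"
)

// accessEvent is one hypothetical audit entry.
type accessEvent struct {
	Actor    string
	Resource string
	Reason   string
	At       time.Time
}

// auditLog keeps entries in memory for the sake of the sketch.
type auditLog struct {
	mu      sync.Mutex
	entries []accessEvent
}

var errMissingReason = errors.New("audit: access reason is required")

// record appends an entry, refusing accesses that come without a reason.
func (l *auditLog) record(actor, resource, reason string) error {
	if strings.TrimSpace(reason) == "" {
		return errMissingReason
	}
	l.mu.Lock()
	defer l.mu.Unlock()
	l.entries = append(l.entries, accessEvent{
		Actor:    actor,
		Resource: resource,
		Reason:   reason,
		At:       time.Now().UTC(),
	})
	return nil
}

// readImage only calls load after the access has been logged, so there
// is no code path that returns image data without an audit entry.
func readImage(l *auditLog, actor, imageID, reason string,
	load func(id string) ([]byte, error)) ([]byte, error) {
	if err := l.record(actor, "image/"+imageID, reason); err != nil {
		return nil, err
	}
	return load(imageID)
}
```

Logging before loading means a failed read still leaves a trace, which is usually what an auditor wants to see. The cost is that every caller has to supply a reason string, and that friction is the point.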
TestFlight feedback is brutal and useful
Beta testers will find every edge case. Low-light crashes (fixed). Portrait vs landscape orientation mismatch causing wrong crops (fixed). Phone case shadow on the lesion causing misclassification (still working on it; data augmentation is the likely fix). Someone with extremely oily skin getting reflections that the model interprets as lesion borders (we had not seen this in the training data at all).
The crash-free rate was around 94% in the first beta build. It is at 99.2% now after three patch releases. Real-world usage surfaces things that even a thorough test matrix misses.
what I would do differently
Start the HIPAA documentation from day one, not as a pre-launch sprint. Plan for on-device preprocessing from the architecture phase. And run a closed alpha with 10 users before opening to a broader beta group. The first build had basic UX problems that a small internal group would have caught immediately.
The technical foundation is solid. The hard work now is on model improvement and making the pipeline fast enough that it does not feel like waiting for a page to load.