Title: Transfer learning by finetuning pretrained CNNs entirely with synthetic images
Authors: Rajpura, Param; Aggarwal, Alakh; Goyal, Manik; Gupta, Sanchit; Talukdar, Jonti; Bojinov, Hristo; Hegde, Ravi
Type: Conference Paper (Book Series)
Year: 2018
Pages: 517-528
ISBN: 9789811300196
DOI: 10.1007/978-981-13-0020-2_45
Scopus ID: 2-s2.0-85046427349
URL: https://d8.irins.org/handle/IITG2025/22981
Date added to repository: 2025-08-30

Abstract: We show that fine-tuning pretrained CNNs entirely on synthetic images is an effective strategy for transfer learning. We apply this strategy to detecting packaged food products clustered in refrigerator scenes. A CNN pretrained on the COCO dataset and fine-tuned with our 4000 synthetic images achieves a mean average precision (mAP @ 0.5 IOU) of 52.59 on a test set of real images (150 distinct products as objects of interest and 25 distractor objects), compared with 24.15 without such fine-tuning. The synthetic images were rendered with freely available 3D models, varying parameters such as color, texture, and viewpoint, without a high emphasis on photorealism. We analyze how factors such as training dataset size, cue variances, 3D model dictionary size, and network architecture influence transfer learning performance. Additionally, we explored training strategies that affect transfer learning from synthetic to real scenes, such as fine-tuning only selected layers and early stopping. This approach is promising in scenarios where limited training data is available.
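Among the training strategies the abstract mentions is early stopping during fine-tuning. As a framework-agnostic illustration (names and parameters are hypothetical, not from the paper), a minimal patience-based monitor that halts training once validation mAP stops improving could look like this:

```python
class EarlyStopping:
    """Hypothetical helper: stop fine-tuning when validation mAP plateaus.

    The paper's exact stopping criterion is not given in the abstract;
    patience-based monitoring is a common choice and is assumed here.
    """

    def __init__(self, patience=3, min_delta=0.0):
        self.patience = patience      # epochs to tolerate without improvement
        self.min_delta = min_delta    # minimum mAP gain counted as improvement
        self.best = float("-inf")     # best validation mAP seen so far
        self.bad_epochs = 0           # consecutive epochs without improvement

    def step(self, val_map):
        """Record one epoch's validation mAP; return True to stop training."""
        if val_map > self.best + self.min_delta:
            self.best = val_map
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience
```

In a fine-tuning loop, `step` would be called once per epoch with the mAP measured on a held-out set of real images, stopping before the network overfits to the synthetic training data.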