The task of image recognition is generally understood to be the task of classifying images, i.e. the task where the image as a whole is assigned into one (or, less often, into several) of a number of classes.
The task is distinct from and not to be confused with that of visual object detection, where objects are detected in an image and their positions are returned, or the task of visual semantic segmentation, where predictions come as pixel-level masks.