

5in5 Sight: IBM Research Is Working on Cognitive Systems That Can Glean Information From Images

Illustration courtesy of IBM

Most people are familiar with the glowing red eye of HAL 9000 from “2001: A Space Odyssey.” Its ever-watchful stare unnerved more than a few moviegoers—and created a still-present fear of sentient computer systems run amok.

Although we haven’t reached that stage yet—and we’re well past 2001—research in computer-based sight is underway. Not to worry, though: This technology won’t be locking the pod bay doors on you.

Rather, it will be used to help medical and security experts, for example, keep an eye out for things humans might otherwise miss, such as a dark spot on an X-ray. And if IBM researcher John R. Smith, senior manager of IBM’s Intelligent Information Management Department, has his way, the type of cognitive system he describes may even be capable of determining which artist painted a particular picture.

Q. You’ve used some interesting terms to describe how humans process sight, such as edge information and texture characteristics. Could you describe those in more detail?
Both of those terms fit into a category we refer to as features. This is an abstraction at a higher level than just the raw data. When an image comes in—let’s say an image from a camera or on a screen—it’s represented through pixels. These are just the individual points in the image.

Typically, to begin to make sense of those pictures (the pixels), we need a higher level of representation, and from the machine-learning point of view, an almost infinite variety of features can represent images. For example, some features capture texture-type characteristics. You can think of texture as whether something has a checkerboard pattern, or whether something in an image is rough or smooth or has spatial variation. Other features are more focused on color, such as color distributions and dominant regions of color.
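
As a rough illustration of the color features described here, the following Python sketch computes a coarse color distribution for an image and picks its dominant color bin. It is a minimal sketch, not the approach used by IBM Research: the function names `color_histogram` and `dominant_bin`, the choice of four bins per channel and the toy image are all assumptions made for illustration.

```python
from collections import Counter


def color_histogram(pixels, bins_per_channel=4):
    """Return the share of pixels falling into each coarse RGB bin.

    `pixels` is a list of rows, each row a list of (r, g, b) tuples
    with channel values in 0..255 (an assumed toy representation).
    """
    step = 256 // bins_per_channel
    counts = Counter()
    total = 0
    for row in pixels:
        for r, g, b in row:
            # Quantize each channel so similar colors share a bin.
            counts[(r // step, g // step, b // step)] += 1
            total += 1
    return {bin_: n / total for bin_, n in counts.items()}


def dominant_bin(histogram):
    """Return the bin that covers the largest share of the image."""
    return max(histogram, key=histogram.get)


# Hypothetical 2x3 image: five reddish pixels and one bluish pixel.
image = [
    [(250, 10, 10), (240, 20, 5), (255, 0, 0)],
    [(245, 5, 15), (10, 10, 250), (250, 15, 10)],
]

hist = color_histogram(image)
print(hist)
print(dominant_bin(hist))
```

The histogram is the higher-level representation: it no longer records where each pixel sits, only how the colors are distributed, which is the kind of abstraction a learning system can compare across images.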

Edges are interesting because they give us a hint as to the boundaries in an image. These edges can tell us where objects are, whether they’re separate from the background and so on. These are useful abstractions that go beyond the pixels and give both the human visual system and a computer a basis to then discriminate, detect and reason about the content of the images.
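
To make the idea of edge information concrete, here is a small Python sketch that applies the Sobel operator, a standard edge filter that the interview does not name and that is used here only as an example. The function name `sobel_magnitude` and the toy grayscale image are assumptions; a real system would likely rely on an image-processing library rather than plain lists.

```python
import math

# Standard 3x3 Sobel kernels for horizontal and vertical intensity change.
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def sobel_magnitude(gray):
    """Return the gradient magnitude for each interior pixel.

    `gray` is a list of rows of intensities (0..255). Border pixels
    are left at 0.0 because the 3x3 window does not fit there.
    """
    height, width = len(gray), len(gray[0])
    out = [[0.0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = gy = 0
            for dy in range(3):
                for dx in range(3):
                    value = gray[y + dy - 1][x + dx - 1]
                    gx += SOBEL_X[dy][dx] * value
                    gy += SOBEL_Y[dy][dx] * value
            out[y][x] = math.hypot(gx, gy)
    return out


# Hypothetical 5x6 image: dark left half, bright right half.
gray = [[0, 0, 0, 255, 255, 255] for _ in range(5)]

edges = sobel_magnitude(gray)
for row in edges:
    print([round(v) for v in row])
```

In this toy image the response is zero inside the flat dark and bright regions and large only in the two columns on either side of the boundary, which is the sense in which edges hint at where one region ends and another begins.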

Q. Is this giving images context based on features?
Yes. Sometimes, you may hear about a descriptor, which is a machine representation of a feature. Texture can be the feature, but you can have 10 different descriptors that each try to best capture the texture content of an image. It’s still an area subject to a lot of experimentation.
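
The distinction between a feature and its descriptors can be sketched in a few lines of Python. Below, texture is the feature and two deliberately simple descriptors try to capture it. Both descriptors, the names `intensity_variance` and `neighbor_contrast`, and the toy images are assumptions chosen for illustration; they are not the descriptors used in the research described here.

```python
def intensity_variance(gray):
    """Descriptor 1: how widely the intensities are spread overall."""
    values = [v for row in gray for v in row]
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def neighbor_contrast(gray):
    """Descriptor 2: mean absolute difference between adjacent pixels."""
    height, width = len(gray), len(gray[0])
    diffs = []
    for y in range(height):
        for x in range(width):
            if x + 1 < width:  # right-hand neighbor
                diffs.append(abs(gray[y][x] - gray[y][x + 1]))
            if y + 1 < height:  # neighbor below
                diffs.append(abs(gray[y][x] - gray[y + 1][x]))
    return sum(diffs) / len(diffs)


# Two hypothetical 4x4 grayscale images with the same mix of dark and
# bright pixels but very different spatial arrangement.
checkerboard = [[255 * ((x + y) % 2) for x in range(4)] for y in range(4)]
two_halves = [[0, 0, 255, 255] for _ in range(4)]

for name, img in [("checkerboard", checkerboard), ("two_halves", two_halves)]:
    print(name, intensity_variance(img), neighbor_contrast(img))
```

On these toy inputs the variance descriptor gives the same value for both images, while the neighbor-contrast descriptor is much higher for the checkerboard. That gap is a small hint of why several descriptors are tried for a single feature.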

Jim Utsler, IBM Systems Magazine senior writer, has been covering the technology field for more than a decade. Jim can be reached at


