Description
Increasingly, image and video data are evaluated not only by human viewers but also by machines using algorithms. In many scenarios, such as smart city systems, the computing power at the recording device is too low to perform the evaluation locally. Instead, the image data is tran...
Detecting objects in front of a self-driving vehicle is critical for safety but poses significant challenges. The range of potential objects on the road includes rare and unknown entities—such as wild animals, debris, or litter—which are underrepresented in existing datasets. The wide v...
Thermographic data recorded by thermal imaging cameras offers a valuable opportunity to monitor and analyze the thermal properties of people, objects and buildings. Compared to conventional video data, thermographic data differs in its special representation of heat distributions,...
Image quality and resolution are becoming increasingly important in numerous imaging industries as well as in the consumer sector, which is also reflected in the demand for high-resolution camera sensors. Up to now, the number of physical pixels has usually been increased in order to achieve a better resolution. However, this is only possible to a limited extent due to the photometric limit.
Alternatively, it has been shown that the resolution of images can be improved by new, non-regular sensor layouts without increasing the number of physical pixels...
The need to transmit screen content data is more important than ever. Online conferencing, screen sharing and remote desktop are just a few applications where screen content data has to be compressed. However, conventional image and video codecs are mainly...
Automatic extraction and interpretation of information from documents is crucial in many fields, ranging from financial to medical. It streamlines tasks, enhances decision-making, and also improves accessibility for individuals with visual impairments. Visual Document Understanding (VDU) offers a solution by automating this process, making it more efficient and enabling quick retrieval of relevant data.
Recent advances in computational power, along with the availability of vast amounts of data, are key enablers for training the next generation of large deep neural networks comprising billions of learnable parameters. This category of networks is commonly known as foundation models. Foundation models ...
Lower limb prostheses support individuals with lower limb amputations in daily tasks, including walking, stair climbing, or running. For adaptation to the current gait situation, it is beneficial to sense the surrounding environment, e.g., with visual sensors. The obtained situation awa...
Point clouds are becoming one of the most common data structures for representing 3D scenes, as they enable a six-degrees-of-freedom (6DoF) viewing experience. Point cloud data has been used in many applications, from immersive media and autonomous driving to healthcare.
However, a typical...