# Sharpened Cosine Similarity: Part 2

A lot has happened since my last post on the Sharpened Cosine Similarity layer. In this post I'll give you a quick overview of the most important developments around this feature extractor, which is shaping up to be more and more interesting.

# Sharpened Cosine Distance as an Alternative for Convolutions

A few days ago Brandon Rohrer retweeted his own Twitter thread from 2020, in which he argues that convolutions are actually pretty bad at extracting features. In it he proposes a method to improve feature extraction that seemed compelling to me.
The formula for this Sharpened Cosine Distance is the following:

$$scd(s, k) = sign(s \cdot k)\Biggl(\frac{s \cdot k}{(\Vert{s}\Vert+q)(\Vert{k}\Vert+q)}\Biggr)^p$$

I decided to try the idea out and built a neural network layer based on this formula. As it turns out, it actually works!
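As a minimal sketch of what the formula computes (pure Python, with `scs` and its parameter names chosen by me for illustration), the core operation looks roughly like this. Note that, as in most implementations, the sign is kept outside the exponentiation so that non-integer values of `p` are well defined:

```python
import math

def scs(s, k, p=2.0, q=1e-3):
    """Sharpened cosine similarity between two vectors s and k.

    q keeps the denominator away from zero for near-empty signals,
    and the exponent p sharpens the peak around perfect alignment.
    """
    dot = sum(si * ki for si, ki in zip(s, k))
    norm_s = math.sqrt(sum(si * si for si in s))
    norm_k = math.sqrt(sum(ki * ki for ki in k))
    cos = dot / ((norm_s + q) * (norm_k + q))
    # sign(s.k) * |cos|^p : sharpening keeps the sign but shrinks weak matches
    return math.copysign(abs(cos) ** p, dot)
```

With `p = 1` and `q = 0` this reduces to plain cosine similarity; larger `p` pushes partial matches toward zero while leaving perfect matches at ±1, which is exactly the "sharpening" in the name.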

# Bringing CLIP to the Italian language with Jax and Hugging Face

CLIP is a model published by OpenAI that learns visual concepts from natural language supervision. It does this by embedding images and their corresponding captions into a joint space and contrastively minimizing their distance. OpenAI only published weights for CLIP trained on English data. That's why, during the JAX/Flax community event organized by Hugging Face and Google, we from the clip-italian team wanted to train a CLIP version that understands 🤌Italian🤌.
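To make the "contrastively minimizing their distance" part concrete, here is a plain-Python sketch of the symmetric InfoNCE-style objective CLIP trains with. This is my own simplified illustration, not the clip-italian training code; the function and parameter names are made up, and matching image/caption pairs are assumed to share the same batch index:

```python
import math

def clip_loss(image_embs, text_embs, temperature=0.07):
    """Symmetric contrastive loss over a batch of paired image/text
    embeddings. The matching caption for image i is text i; every other
    caption in the batch acts as a negative."""
    def normalize(v):
        n = math.sqrt(sum(x * x for x in v)) or 1.0
        return [x / n for x in v]

    imgs = [normalize(v) for v in image_embs]
    txts = [normalize(v) for v in text_embs]

    # similarity logits: cosine similarity scaled by the temperature
    logits = [[sum(a * b for a, b in zip(i, t)) / temperature for t in txts]
              for i in imgs]

    def cross_entropy(rows):
        # the correct class for row i is column i (the matching pair)
        total = 0.0
        for i, row in enumerate(rows):
            log_sum = math.log(sum(math.exp(x) for x in row))
            total += log_sum - row[i]
        return total / len(rows)

    # average the image->text and text->image directions
    transposed = [list(col) for col in zip(*logits)]
    return 0.5 * (cross_entropy(logits) + cross_entropy(transposed))
```

When matched pairs land close together in the joint space, the diagonal logits dominate and the loss goes to zero; mismatched pairings are penalized from both directions.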

# Imax: Making Image Augmentations fast with JAX

Image augmentations make all the difference when working with neural networks. Everybody should know that by now. No matter what you're trying to train, if it involves images, you should be using heavy and fancy augmentations! The only downside of heavy augmentations is that they can slow down your training significantly if they are not implemented in a fast and efficient way. With Imax the goal was to solve that while getting better at JAX.
And you can try the results today!
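For a feel of what an augmentation op is, here are two of the simplest ones written in plain Python on nested lists (my own toy stand-ins, not the Imax API). The whole point of Imax is that ops like these are expressed as JAX array operations instead, so they can be JIT-compiled and run batched on the accelerator rather than looping on the CPU:

```python
def hflip(image):
    """Left-right flip: mirror each row of a rows-of-pixels image."""
    return [row[::-1] for row in image]

def rotate90(image):
    """Rotate 90 degrees clockwise: reverse the rows, then transpose."""
    return [list(row) for row in zip(*image[::-1])]
```

Four clockwise rotations get you back to the original image, which makes these easy to sanity-check before trusting them inside a training pipeline.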

# JUDO-Net (Extended Edition)

Since my paper on "Joint Unsupervised Depth-Estimation and Obstacle-Detection" did not get accepted to NeurIPS 2019, I had yet another unpublished paper lying around. Back then, more and more people around me started to get interested in neural networks, and some (including my mom😂) were also interested in my work. I, however, kept struggling to explain to them what exactly it was I was doing.
So I had an idea: What if I could start explaining the concepts of neural networks at a relatively low level and explain my way up from there?

# BBoxr: A simple Tool to collect Bounding Boxes

I made a thing! It's called BBoxr, and it's a web app for collecting images and bounding-box annotations without installing anything. It works on desktops and phones. You can play with it here: www.rpi-bboxr.web.app

# Hackathon: Diagnosing COVID-19 with X-Rays and Transfer Learning

In the early days of COVID-19, before there were cheap PCR and antigen tests, one way to diagnose it was through a CT scan. The obvious drawbacks were, of course, that CT scans are pretty expensive, that they are not equally available in all parts of the world, and that they deliver quite a high dose of radiation.

In March 2020, Esther Schaiter and I heard about a 24-hour online hackathon with the goal of surfacing ideas to combat problems related to COVID-19. Since Esther was a final-year medical student, she was very close to the issue. Therefore, we decided to tackle the idea of diagnosing COVID with X-rays.

# Joint Unsupervised Depth-Estimation and Obstacle-Detection

Inspired by J-MOD² by Mancini et al. and my previous paper on depth estimation I wanted to go back and re-join depth estimation and obstacle detection in a single network. In the process I not only beat the SOTA for unsupervised monocular depth estimation but also introduced some really cool loss functions and a way to train a model to segment obstacles without ground truth data. Fun stuff, I promise!

# Internship Report 2018 (3/3)

#### 3D Reconstruction from Stereo Images

The last internship project I'll tell you about didn't produce a satisfactory result in the limited time available, but at the same time it was the one I definitely found most interesting, for reasons I will explain below. While 3D reconstruction is a really interesting topic in general, it is essential for companies like Microtec, which need to cut logs into the required planks in the most efficient way possible.

# Internship Report 2018 (2/3)

#### Plank Segmentation Model

The next project I'm going to tell you about is a model to segment wooden planks inside machines or in an industrial environment. This information can be used to improve many control problems in sawmill machines that currently have to be addressed with traditional sensors. For this model I would have liked to implement an oriented version of YOLOv3, but due to time constraints I essentially ended up using the same approach as for the log segmentation.

# Internship Report 2018 (1/3)

This summer I did an internship at Microtec, one of the leading providers of wood-scanning solutions for sawmills and an emerging power in the domain of food scanning. In the next few posts I'm going to tell you about some of the projects I worked on there. I was hired with the goal of applying my experience in machine learning, and especially deep learning, to some of the tasks the company is currently facing.