Here are the slides from the talk I gave at the Berlin Buzzwords 2014 conference.

# MozCon 2013 Ranking Factors importance, expert survey

What caught my attention:

- Almost no mention of click factors or traffic signals.
- Language models are recognized by SEOs.

Nothing new, but it is always interesting to see how SEO guys view search engine internals.

# Visualization: learning performance of different algorithms on synthetic data sets

Long time no see, if anyone is reading this blog. A few days ago, I found a very interesting visualization of how different ML algorithms perform on synthetic data sets. The study covers Bayes, KNN, regression, rule-based, tree-based, and SVM models. No wonder that an SVM with well-tuned RBF kernel gamma and cost parameters performs very well; the same goes for tree-based models.

# A trip around Austria [russian]

From April 13 to 19, my family (my parents and my wife) and I took a trip around Austria. For a week-long road trip we chose a long route: a little over 2,000 km in total. As a result, we drove across all of Austria from west to east, with a detour to the south.

# Similarity metrics

Originally posted by Sam Chapman, University of Sheffield, Department of Computer Science, NLP Group, United Kingdom. The original page is no longer available, so I’m republishing it here.

- Hamming distance
- Levenshtein distance
- Needleman-Wunsch distance or Sellers Algorithm
- Smith-Waterman distance
- Gotoh Distance or Smith-Waterman-Gotoh distance
- Block distance or L1 distance or City block distance
- Monge Elkan distance
- Jaro distance metric
- Jaro Winkler
- SoundEx distance metric
- Matching Coefficient
- Dice’s Coefficient
- Jaccard Similarity or Jaccard Coefficient or Tanimoto coefficient
- Overlap Coefficient
- Euclidean distance or L2 distance
- Cosine similarity
- Variational distance
- Hellinger distance or Bhattacharyya distance
- Information Radius (Jensen-Shannon divergence)
- Harmonic Mean
- Skew divergence
- Confusion Probability
- Tau
- Fellegi and Sunter (SFS) metric
- TFIDF or TF/IDF
- FastA
- BlastP
- Maximal matches
- q-gram
- Ukkonen Algorithms
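
To give a feel for a few entries in the list, here is a minimal Python sketch of Hamming distance, Levenshtein distance, and the Jaccard and Dice coefficients computed over character q-grams. It is not taken from the original page: the function names and the choice of bigrams (`q=2`) as the default are my own assumptions, and the code uses only the standard textbook definitions.

```python
def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs strings of equal length")
    return sum(x != y for x, y in zip(a, b))


def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions, and
    substitutions needed to turn a into b (two-row dynamic programming)."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def qgrams(s: str, q: int = 2) -> set:
    """Set of character q-grams of s (q=2 is an assumed default)."""
    return {s[i:i + q] for i in range(len(s) - q + 1)}


def jaccard(x: set, y: set) -> float:
    """Size of the intersection divided by size of the union."""
    if not x and not y:
        return 1.0  # convention chosen here: two empty sets are identical
    return len(x & y) / len(x | y)


def dice(x: set, y: set) -> float:
    """Twice the intersection size divided by the sum of the set sizes."""
    if not x and not y:
        return 1.0  # same convention as in jaccard
    return 2 * len(x & y) / (len(x) + len(y))
```

For example, `levenshtein("kitten", "sitting")` gives 3, and `jaccard(qgrams("night"), qgrams("nacht"))` gives 1/7, since the two words share only the bigram `ht` out of seven distinct bigrams.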