Here are the slides from the talk I gave at the Berlin Buzzwords 2014 conference.
MozCon 2013 Ranking Factors importance, expert survey
What caught my attention:
- Almost no mention of click factors or traffic signals.
- Language models are recognized by SEOs.
Nothing new, but it is always interesting to see how SEO people view search engine internals.
Visualization: learning performance of different algorithms on synthetic data sets

Learning performance of different algorithms on synthetic data sets
Long time no see, if anyone is still reading this blog. A few days ago, I found a very interesting visualization of how different ML algorithms perform on synthetic data sets. The study contains Bayes, KNN, regression, rule-based, tree-based, and SVM models. It is no wonder that an SVM with well-tuned RBF kernel gamma and cost parameters performs very well; the same goes for tree-based models.
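To make the gamma parameter a bit more concrete, here is a minimal sketch of the RBF kernel itself, written in plain Python. It is only an illustration, not code from the study: the function name `rbf_kernel` and the sample points are my own assumptions, and the cost parameter is not shown because it belongs to the SVM training step rather than to the kernel.

```python
import math


def rbf_kernel(x, y, gamma):
    """RBF (Gaussian) kernel: exp(-gamma * ||x - y||^2).

    Hypothetical helper for illustration only.
    """
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


if __name__ == "__main__":
    a, b = (0.0, 0.0), (1.0, 1.0)
    # Small gamma: distant points still look similar (smooth boundary).
    # Large gamma: similarity drops off quickly (very local boundary).
    for gamma in (0.01, 1.0, 100.0):
        print(gamma, rbf_kernel(a, b, gamma))
```

With a small gamma the kernel value stays close to 1 even for distant points, while a large gamma pushes it towards 0, which is why this parameter usually has to be tuned together with the cost.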
A trip around Austria [Russian]
From April 13 to 19, my family (my parents and my wife) and I took a trip around Austria. For a week-long road trip we chose a long route: a little over 2,000 km in total. In the end, we drove across all of Austria from west to east, with a detour to the south.
Similarity metrics
Originally posted by Sam Chapman, University of Sheffield, Department of Computer Science, NLP Group, United Kingdom. The original page is no longer available, so I am republishing it here.
- Hamming distance
- Levenshtein distance
- Needleman-Wunsch distance or Sellers Algorithm
- Smith-Waterman distance
- Gotoh Distance or Smith-Waterman-Gotoh distance
- Block distance or L1 distance or City block distance
- Monge Elkan distance
- Jaro distance metric
- Jaro Winkler
- SoundEx distance metric
- Matching Coefficient
- Dice’s Coefficient
- Jaccard Similarity or Jaccard Coefficient or Tanimoto coefficient
- Overlap Coefficient
- Euclidean distance or L2 distance
- Cosine similarity
- Variational distance
- Hellinger distance or Bhattacharyya distance
- Information Radius (Jensen-Shannon divergence)
- Harmonic Mean
- Skew divergence
- Confusion Probability
- Tau
- Fellegi and Sunter (SFS) metric
- TFIDF or TF/IDF
- FastA
- BlastP
- Maximal matches
- q-gram
- Ukkonen Algorithms
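
To give a feel for a few of the metrics listed above, here is a rough sketch in plain Python of Hamming distance, Levenshtein distance, and the Jaccard and Dice coefficients. It is not the code from the original page: the function names are my own, and comparing strings as sets of unpadded character bigrams (q-grams with q = 2) is a simplifying assumption; other definitions work on word tokens or padded q-grams.

```python
def hamming_distance(s, t):
    """Number of positions at which two equal-length strings differ."""
    if len(s) != len(t):
        raise ValueError("Hamming distance needs equal-length strings")
    return sum(a != b for a, b in zip(s, t))


def levenshtein_distance(s, t):
    """Minimum number of single-character insertions, deletions and
    substitutions needed to turn s into t (two-row dynamic programming)."""
    previous = list(range(len(t) + 1))
    for i, a in enumerate(s, start=1):
        current = [i]
        for j, b in enumerate(t, start=1):
            current.append(min(
                previous[j] + 1,             # delete a from s
                current[j - 1] + 1,          # insert b into s
                previous[j - 1] + (a != b),  # substitute, or match for free
            ))
        previous = current
    return previous[-1]


def qgrams(s, q=2):
    """Set of character q-grams of s (no padding; an assumption here)."""
    return {s[i:i + q] for i in range(len(s) - q + 1)}


def jaccard_similarity(a, b):
    """|A intersect B| / |A union B| for two sets."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def dice_coefficient(a, b):
    """2 * |A intersect B| / (|A| + |B|) for two sets."""
    if not a and not b:
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))


if __name__ == "__main__":
    print(hamming_distance("karolin", "kathrin"))
    print(levenshtein_distance("kitten", "sitting"))
    print(jaccard_similarity(qgrams("night"), qgrams("nacht")))
    print(dice_coefficient(qgrams("night"), qgrams("nacht")))
```

Note that the first two functions return distances (0 means identical), while the last two return similarities (1 means identical), so the scores are not directly comparable without normalization.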