
https://msapaydin.wordpress.com/feed

Last-Modified Wed, 13 May 2026 23:56:28 GMT

Posts

Content moderation in Turkish
Uncategorized

Our paper with the kariyer.net research team has been published! It is about applying transformer models to moderate Turkish text messages on the online job portal for blue-collar jobs. The paper is available here.

In summary, it describes how we applied a Turkish BERT model to classify real content on kariyer.net platforms.

msapaydin
http://msapaydin.wordpress.com/?p=424
On university rankings
Uncategorized

In the US, there are many lesser-known universities (shall we call them community colleges?) that focus on getting their students employed. As such, they follow employment trends and pay consulting companies to guide them in opening new programs, adjusting syllabi, and steering students toward professions in which they should be able to find a job fairly easily.

In Turkey, universities do not do this. Instead, students flock to software-related jobs on their own, even though software jobs vary widely and it is not clear that what they are learning is in demand in their local industries. For instance, AI is a buzzword that most students try to learn. It is admittedly booming, but there is more basic material that students could master first to get a job quickly and with a good salary. This is instead done in an ad hoc way, student by student.

I believe that universities that connect job-market trends to their students’ career counseling can gain an edge in attracting better students and raising their profile.

msapaydin
http://msapaydin.wordpress.com/?p=428
Xilinx and AI
Uncategorized

I was recently awarded a Xilinx PYNQ-Z2 board for AI inference, and I have had some chance to play with it. The board contains a Linux computer (similar to a Raspberry Pi) and an FPGA. I don’t know FPGA programming yet, but the Xilinx community offers resources that play a role similar to Python software libraries. With the Xilinx chips one can find a suitable overlay (a bit file that programs the FPGA) and move some of the computation to the hardware to accelerate it. The Xilinx ecosystem is nowhere near as rich in overlay files as Python is in software libraries, but Xilinx is making an effort to increase awareness of such chips. The PYNQ-Z2 has a Jupyter interface that worked instantly with my Linux computer (Fedora, different from the Linux running on the PYNQ-Z2). FPGA programming requires the Vivado tools, which do not support all Linux flavors (such as Fedora), but this is not a big problem. What I have not yet seen are interesting FPGA overlays that allow deep learning directly on the PYNQ-Z2; I am still learning and researching. So far I have only been able to find an image-resizing application for the PYNQ-Z2, and it does not come with a timing comparison against running on the CPU. Also, image resizing is not terribly complicated.
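
The missing CPU-side number could be measured along the following lines. This is only a minimal sketch and not the PYNQ example itself: `resize_nearest` and `time_cpu_resize` are hypothetical helper names, the input is a synthetic gradient rather than a real photo, and the pure-Python nearest-neighbour resize is a stand-in for whatever resize routine the overlay actually accelerates.

```python
import time


def resize_nearest(image, new_h, new_w):
    """Nearest-neighbour resize of a 2D list of pixel values (hypothetical helper)."""
    old_h, old_w = len(image), len(image[0])
    return [
        [image[r * old_h // new_h][c * old_w // new_w] for c in range(new_w)]
        for r in range(new_h)
    ]


def time_cpu_resize(image, new_h, new_w, repeats=5):
    """Return (resized image, best wall-clock seconds over `repeats` runs)."""
    best = float("inf")
    resized = None
    for _ in range(repeats):
        start = time.perf_counter()
        resized = resize_nearest(image, new_h, new_w)
        best = min(best, time.perf_counter() - start)
    return resized, best


# Synthetic 480x640 grayscale gradient standing in for a real photo (assumption).
image = [[(r + c) % 256 for c in range(640)] for r in range(480)]
resized, seconds = time_cpu_resize(image, 240, 320)
print(f"CPU resize to {len(resized)}x{len(resized[0])}: {seconds * 1000:.2f} ms")
```

Running the same input through the overlay inside the board’s Jupyter interface, and timing it the same way, would give the other half of the comparison.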

With model-sharing sites such as Model Zoo or Hugging Face, and given the computational expense of training such models on a CPU, a distinct advantage would come from running these models in a reasonable amount of time. FPGAs may be a good (and cheaper?) alternative to Nvidia GPUs.

Thanks to Xilinx for making a donation to my lab.

msapaydin
http://msapaydin.wordpress.com/?p=404
AI and healthcare
Uncategorized

Yesterday I was a panelist in a session entitled “Artificial Intelligence and healthcare”. To prepare for the session, I had started reading Eric Topol’s book “The Patient Will See You Now”. It is a good book that covers many aspects of healthcare, going from Hippocrates all the way to the printing press invented by Gutenberg. Topol compares the invention of the printing press to that of the smartphone: after the printing press was invented, there was a drastic increase in the production and consumption of knowledge. Similarly, with smartphones and their built-in sensors, the ability to collect big data from the patient and to analyze it has increased by orders of magnitude. One usually does not think of smartphones if one comes from a bioinformatics background like myself, but given how much more ubiquitous smartphones are than sequencing devices, especially in less developed countries such as Turkey, it makes sense to focus first on AI applications with smartphones. Many such applications can be global, and it is hard to compete on those with large companies that have big resources. But when it comes to things that cannot be global, such as those based on understanding the local language, there is a dire need for local entrepreneurs to contribute to the AI and healthcare ecosystem.

msapaydin
http://msapaydin.wordpress.com/?p=391
climate change and AI
Uncategorized

There are growing concerns about global warming, and there have been recent forest fires in many regions. AI and other compute-intensive technologies such as bitcoin mining accelerate these effects. There is growing interest in how to make the development of pre-trained models (BERT, ELECTRA, GPT-3 and the like) less compute intensive, and Stanford has announced a multidisciplinary research group that studies various aspects of what they call “foundation models”.

Climatechange.ai is a portal with an active research grant call that invites researchers to develop datasets and deployments to mitigate climate change using AI. Although the call does not say so explicitly, food delivery and car sharing apps also contribute to global warming. Better algorithms designed to reduce greenhouse gas emissions, if deployed on such platforms, could also help reduce global warming.

msapaydin
http://msapaydin.wordpress.com/?p=380
data and ethics
Uncategorized

For the last class of natural language processing, I gave a presentation on the ethical aspects of AI. Many people think that since they are not doing anything illegal, it is OK for others to access their data. As a person who tries to avoid Facebook, WhatsApp, and the like, I have usually found it difficult to address such comments. However, I found a 2017 talk that discusses data and its effects on privacy, part of a project called Social Cooling. It essentially says that a lot can be deduced about you if you use these services, not from your data but from your “derived” data, such as the metadata or the set of clicks you make on a site such as YouTube. This data can then be sold to health insurance companies, banks, or employment agencies, so that they know a lot more about you than you would like, which can result in denied loans, more expensive insurance policies, or rejected job applications. So privacy is about a lot more than whether what you do is legal or illegal. In the words of “socialcooling.com”, it is about having the freedom to be imperfect, as all humans are.

msapaydin
http://msapaydin.wordpress.com/?p=362
filter bubble
Uncategorized

As we have access to more information, the level of uninformed-ness increases.

I spoke to two people recently. One believed in homeopathy, and that 5G wireless will help control people through magnetic waves.

The other believed that covid was known in 2011, and that a Google co-founder, along with Elon Musk, invested in covid remedies five years ago.

This is called a filter bubble. As we increasingly receive our information from Facebook and other AI-based recommendation systems, we get misinformed.

There is a very good TED talk about this.

My point is, I find this also in university teaching. Professors who try to teach too much advanced material to students inevitably cause those students to miss even the most basic material of the class, as they get inundated with superfluous material.

Many “bad” textbooks also do the same.

That is why I find it invaluable to have good resources to learn the material from.

msapaydin
http://msapaydin.wordpress.com/?p=353
Best practices for getting up to speed on AI
Uncategorized
  1. fast.ai v2. I find that it makes it much easier to get going with many different applications of deep learning, with freely available lecture videos on YouTube and their website. Based on PyTorch.
  2. Hugging Face Transformers for natural language processing applications. Makes otherwise quite laborious tasks much easier, such as sentiment analysis, summarization, and question answering, in many languages including Turkish. Enables transfer learning.
  3. Colab for development, which is much simpler than dealing with the conda package manager, driver installation, or other less well-known alternatives.
  4. Subscribing to The Batch newsletter by Andrew Ng. A pretty good summary of the latest on deep learning.
  5. In the past, I have also used the textbooks “Python Machine Learning” by S. Raschka and “deep learning with Keras and TensorFlow” by A. Géron, which are still very good resources, with Jupyter notebooks freely available on GitHub.
  6. The TWIML podcast on Spotify.

(The above is a list of resources I wish someone had told me about, and that I would point to if I were starting in this field today. It is addressed mostly to potential students.)

msapaydin
http://msapaydin.wordpress.com/?p=346
Practice variations amongst different organizations
Uncategorized

Walking on a major street in Istanbul. The area in front of the hamburger vendor is full of customers waiting for their burgers after paying, each with a ticket number in hand. They are not waiting outside to get in and place their order; they are handled quickly, and no queue cutting takes place, unlike in most places in the Middle East.

vs.

A major private bank. People queue outside, forming various shapes on the sidewalk, before they can enter the bank, get a ticket number, and be served by one of the bank operators.

They both operate in the same corner of Istanbul and are both for-profit institutions. One has figured out that it pays to serve customers well; the other does not care, even though it is one of the major private banks of Turkey.

Some of this stark difference also exists between academia and industry: industry adopts best practices much more quickly and is much more nimble, while academia is inundated by trifles, bureaucracy, and paperwork. Again, a cross-institutional difference at a high level.

Just some observations from daily life in Turkey as an academic and a practitioner.

msapaydin
http://msapaydin.wordpress.com/?p=336
On sentence pair classification task in Turkish
Uncategorized

I am examining the sentence pair classification dataset [1] that a group of researchers automatically translated from English to Turkish [2]. They say in their paper that they manually examined the translations and found them fine; I am not sure I agree. They say they manually examined 1K out of 570K sentence pairs (<0.2%). Furthermore, when I read the sentence pairs in Turkish, as a native Turkish speaker, I have difficulty understanding what the translations mean and what the corresponding label might be. Even though it is very nice of them to share these translated datasets publicly, I find their conclusions perhaps a bit premature (with respect to using these translated datasets to train for sentence semantic comparison tasks). Perhaps this is a general problem for low-resource languages, as the translations do not work very well (resulting in “chicken translate”), and it may be better to use the multilingual Sentence Transformers models directly [3] rather than fine-tuning the BERT model on the translated MNLI-TR dataset.

[1] https://github.com/boun-tabi/NLI-TR

[2] https://arxiv.org/abs/2004.14963

[3] https://github.com/UKPLab/sentence-transformers
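
The sampling fraction quoted above can be checked with a couple of lines. This is just a worked version of the arithmetic, using the rounded “1K” and “570K” figures from the post; the variable names are mine.

```python
examined_pairs = 1_000    # "1K" pairs checked by hand, as quoted in the post
total_pairs = 570_000     # "570K" translated pairs, as quoted in the post

fraction = examined_pairs / total_pairs
print(f"Manually examined: {fraction:.3%} of all pairs")  # prints 0.175%
```

So roughly one pair in every 570 was looked at, which is consistent with the “<0.2%” figure.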

PS. Sample training data (from [1], in Turkish):

sample #1

“genre”: “government”,
“gold_label”: “contradiction”,
“pairID”: “102489c”,
“promptID”: “102489”,
“sentence1”: “Endişelerinizden bahsetmek istemem ama sizin yerinizde olsaydım, bu 1 doların yakın vadeli fiyat sonuçları hakkında daha fazla endişe duyabilirdim.”,
“sentence2”: “Sizin sorunlarınız hakkında, yakın vadeli oranlardan daha çok endişeliyim.”

sample #2

“genre”: “slate”,
“gold_label”: “entailment”,
“pairID”: “133002e”,
“promptID”: “133002”,
“sentence1”: “Shesol’un atıfta bulunduğu ancak nispeten incelenmeden geçmesine izin verdiği olağanüstü bir istatistik alın.”,
“sentence2”: “Çok alakalı ama kullanılmakta olan veriler vardı.”

sample #3

“genre”: “telephone”,
“gold_label”: “entailment”,
“pairID”: “101457e”,
“promptID”: “101457”,
“sentence1”: “Mevsim boyunca ve sanırım senin seviyendeyken onları bir sonraki seviyeye düşürürsün. Eğer ebeveyn takımını çağırmaya karar verirlerse Braves üçlü A’dan birini çağırmaya karar verirlerse çifte bir adam onun yerine geçmeye gider ve bekar bir adam gelir.”,
“sentence2”: “Eğer insanlar hatırlarsa, bir sonraki seviyeye düşersin.”
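
For anyone who wants to inspect such records programmatically, here is a minimal sketch. It assumes the records are available as JSON objects with the field names shown in the samples; the sentences are left out for brevity, and `suffix_matches_label` is a hypothetical helper that checks a pattern I can only confirm for these three samples (the `pairID` is the `promptID` followed by the first letter of the label).

```python
import json
from collections import Counter

# Metadata copied from the three samples above; sentence fields omitted for brevity.
records_json = """
[
  {"genre": "government", "gold_label": "contradiction", "pairID": "102489c", "promptID": "102489"},
  {"genre": "slate", "gold_label": "entailment", "pairID": "133002e", "promptID": "133002"},
  {"genre": "telephone", "gold_label": "entailment", "pairID": "101457e", "promptID": "101457"}
]
"""
records = json.loads(records_json)

# How many samples carry each label.
label_counts = Counter(record["gold_label"] for record in records)


def suffix_matches_label(record):
    """Pattern seen in these samples only: pairID = promptID + first letter of the label."""
    return record["pairID"] == record["promptID"] + record["gold_label"][0]


consistent = all(suffix_matches_label(record) for record in records)
print(dict(label_counts), consistent)
```

A check like this only confirms that the labels were carried over intact. Whether the Turkish sentences still support those labels is the part that needs a human reader.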

msapaydin
http://msapaydin.wordpress.com/?p=273