I have finally managed to start solving the captcha levels (I am now at lvl 4). From what I have read of others’ suggestions, a neural network for symbol recognition or a vector-space search engine should do the job. While I am still learning how to implement a neural net, I would like to discuss a few ideas that I believe could help solve the level without one. I am trying to implement them as well, but I thought that by bouncing them off the community I could learn why they are or are not viable for this problem.
[*] Using the algorithm from captcha3.
My idea is that we can reuse the same algorithm. We should already have a script (or at least I think we should) that crops the captcha picture and extracts the emojis one by one. The only difference is that we would have to convert both the extracted images and the “training set” to grayscale, or at least to a similar colour. After that, compute the hash values for the training set, then compute the hash values for each extracted emoji, trying each of the 359 possible rotations until we find a match. So far I have been unsuccessful because of the way I am rotating the images (the background becomes transparent instead of black). I will keep working on this idea. Do you think it is doable, or would you suggest that I implement it further and see for myself why it is or is not?
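A minimal sketch of the rotate-and-hash search described above, written under several assumptions: Pillow is the imaging library, a plain average hash stands in for whatever hash captcha3 actually used, and the helper names (`to_gray_on_black`, `average_hash`, `best_rotation_match`) are made up for illustration. Flattening the image onto black before converting to grayscale, and passing `fillcolor=0` to `rotate`, is one way to keep the corners black instead of transparent.

```python
from PIL import Image


def to_gray_on_black(img):
    """Flatten any transparency onto black, then convert to 8-bit grayscale."""
    rgba = img.convert("RGBA")
    black = Image.new("RGBA", rgba.size, (0, 0, 0, 255))
    return Image.alpha_composite(black, rgba).convert("L")


def average_hash(img, size=8):
    """Average hash (assumed stand-in): one bit per cell, set if brighter than the mean."""
    small = img.resize((size, size))
    pixels = list(small.tobytes())  # mode "L": one byte per pixel
    mean = sum(pixels) / len(pixels)
    bits = 0
    for p in pixels:
        bits = (bits << 1) | (p > mean)
    return bits


def hamming(a, b):
    """Number of differing bits between two integer hashes."""
    return bin(a ^ b).count("1")


def best_rotation_match(emoji, template_hashes, step=1):
    """Try every rotation of `emoji`; return (label, angle, distance) of the closest template."""
    gray = to_gray_on_black(emoji)
    best = (None, None, 65)  # 65 is larger than any 64-bit Hamming distance
    for angle in range(0, 360, step):
        # fillcolor=0 paints the exposed corners black rather than leaving them empty
        rotated = gray.rotate(angle, fillcolor=0)
        h = average_hash(rotated)
        for label, template_hash in template_hashes.items():
            d = hamming(h, template_hash)
            if d < best[2]:
                best = (label, angle, d)
    return best
```

Here `template_hashes` would be a dict such as `{"smile": average_hash(to_gray_on_black(smile_img))}`. Rotations that are not multiples of 90° resample the pixels, so an exact hash match is unlikely there; accepting a small Hamming distance instead of requiring 0 is probably more realistic, and that tolerance is also where false positives can creep in.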
[*] Use a similar idea, but play with the hash function.
The only difference from the previous idea is that we might not need to rotate the image at all. Instead, we could look for a hash function that works on the frequency domain or on the histogram of the image. That way, the “density” of pixels and colours (assuming both target and source use the same colours) would drive the hash, so two images of the same emoji would match even when they are rotated differently. Again, what do you think? Good/not good, or implement and see for myself?
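To make the histogram variant concrete, here is a small sketch assuming NumPy and grayscale images stored as 2-D `uint8` arrays; the function names and the choice of 16 bins are illustrative assumptions, not something taken from the challenge.

```python
import numpy as np


def histogram_signature(gray, bins=16):
    """Normalised grey-level histogram of a 2-D uint8 array."""
    hist, _ = np.histogram(gray, bins=bins, range=(0, 256))
    return hist / hist.sum()


def histogram_distance(a, b):
    """L1 distance between two normalised histograms (0.0 means identical)."""
    return float(np.abs(a - b).sum())
```

A rotation by a multiple of 90° only moves pixels around, so the signature is unchanged. For other angles it is only approximately preserved, because interpolation and cropped corners alter some pixel values. The weakness is that a histogram throws away all spatial layout: two different emojis with a similar mix of grey levels get nearly the same signature, which may be why this idea alone tends to produce false positives.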
Bouarg, I solved it following your first idea with a kind of nearest-neighbour algorithm, where the distance between images is based on the Hu moments, which are rotation invariant. My training set contains fewer than 60 keys and I made fewer than 20 attempts (after each failed attempt, the training set gets bigger).
I do believe that we can improve the accuracy using a generative-model approach.
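For readers who want to try the Hu-moment nearest-neighbour idea, the following is a rough sketch and not the code used above. It assumes NumPy, images given as 2-D arrays with intensities roughly in 0..1 and at least one non-zero pixel, and a log-scaled signature with an arbitrary `1e-12` floor; `signature` and `nearest_label` are hypothetical helper names.

```python
import numpy as np


def hu_moments(gray):
    """Seven Hu moment invariants of a non-empty 2-D intensity array."""
    img = np.asarray(gray, dtype=np.float64)
    ys, xs = np.mgrid[:img.shape[0], :img.shape[1]]
    m00 = img.sum()
    dx = xs - (xs * img).sum() / m00  # x offsets from the centroid
    dy = ys - (ys * img).sum() / m00  # y offsets from the centroid

    def eta(p, q):
        # Scale-normalised central moment of order (p, q)
        return (dx**p * dy**q * img).sum() / m00 ** (1 + (p + q) / 2)

    n20, n02, n11 = eta(2, 0), eta(0, 2), eta(1, 1)
    n30, n03, n21, n12 = eta(3, 0), eta(0, 3), eta(2, 1), eta(1, 2)
    a, b = n30 + n12, n21 + n03
    c, d = n30 - 3 * n12, 3 * n21 - n03
    return np.array([
        n20 + n02,
        (n20 - n02) ** 2 + 4 * n11**2,
        c**2 + d**2,
        a**2 + b**2,
        c * a * (a**2 - 3 * b**2) + d * b * (3 * a**2 - b**2),
        (n20 - n02) * (a**2 - b**2) + 4 * n11 * a * b,
        d * a * (a**2 - 3 * b**2) - c * b * (3 * a**2 - b**2),
    ])


def signature(gray):
    """Log-scaled magnitudes so the tiny higher-order moments still count (assumed scaling)."""
    return np.log10(np.abs(hu_moments(gray)) + 1e-12)


def nearest_label(query, training):
    """training: list of (label, signature) pairs; returns the label of the closest one."""
    q = signature(query)
    return min(training, key=lambda item: np.linalg.norm(item[1] - q))[0]
```

Growing the training set after a failed attempt then amounts to appending `(correct_label, signature(emoji))` to `training`. Taking absolute values discards the sign of the seventh moment, so mirrored shapes are not told apart in this sketch. OpenCV also provides `cv2.moments` and `cv2.HuMoments` for computing these invariants, if that library is available.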
I’m trying to solve it using the approach you described in the first bullet point… but unfortunately that way seems to be time-consuming and prone to false positives! :(
I will try to improve my solution by implementing what @prolea suggested… but it’s not my favorite topic, fingers crossed! B)
Any hint from @prolea will be appreciated… :D :p