Quote:
Originally Posted by DeitY
How reliable is this?
The current models are very unreliable.
What you are seeing in the video:
- me handpicking the video where it performed best
- model trained to detect ONLY me using an aimbot
- model trained to detect on a local server (almost ideal network conditions)
- trained to work only with the M4
- trained with exactly one kind of aimbot
- 98% accuracy
The moment I put the same model online:
- ~90% (IIRC) accuracy
By design (because of the training data used), it would not work when multiple players are involved. For example, the SVM finds the maximum-margin hyperplane, i.e. it would place the hyperplane so that it neatly separates my skill set from the aimbot skill set, but in reality there would be players whose skill set falls somewhere in between. Hence, it is expected to fail on them.
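For anyone curious, here is roughly what that boils down to; a minimal sketch with synthetic numbers and scikit-learn, not the actual detector code:
Code:
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
# Synthetic stand-in for the transformed samples: legit play clusters low,
# aimbot samples cluster high on (hypothetical) accuracy-style features.
X_legit = rng.normal(0.4, 0.10, size=(50, 16))
X_bot   = rng.normal(0.9, 0.05, size=(50, 16))
X = np.vstack([X_legit, X_bot])
y = np.array([0] * 50 + [1] * 50)

clf = SVC(kernel="linear")   # finds the maximum-margin separating hyperplane
clf.fit(X, y)

# A player whose skill sits between the two clusters is exactly where the
# hyperplane placement stops being meaningful:
borderline = rng.normal(0.65, 0.05, size=(1, 16))
print(clf.predict(borderline))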
That's why I'm trying to collect data from many players. I expect a roughly normal distribution for non-aimbot players and a distribution skewed significantly towards the higher skill range for aimbot users. Combined, it should hopefully give a nice distribution with a bump at the higher skill range. But I'm not really sure it'll turn out this way; there could also be players who are as skilled as imperfect aimbots.
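Something like this is the shape I'm hoping for; the parameters are made up purely for illustration:
Code:
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
# Hypothetical "skill" scores: legit players roughly normal, aimbot users
# skewed towards the high end (all numbers are invented for this sketch).
legit  = rng.normal(loc=0.5, scale=0.12, size=5000)
aimbot = stats.skewnorm.rvs(a=-6, loc=0.95, scale=0.08, size=500, random_state=1)

combined = np.concatenate([legit, aimbot])
hist, edges = np.histogram(combined, bins=30, range=(0.0, 1.1))
print(hist)  # the hoped-for "bump" shows up in the last few bins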
The AI machinery works on a transformed dataset. The raw data is analyzed and transformed into samples, which are statistics about the last 8-20 shots. The collected raw data is filtered such that almost all of the shots are rejected. There are heavy restrictions on the allowed samples (a rough sketch of the filter follows the list):
- must use M4
- ping less than X
- victim must be moving
- gap between shots must be less than Y ms
- hit series must have a hit ratio above 0.5
- more here and here
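The filter itself is conceptually something like this; the field names, weapon ID and thresholds are placeholders, not the real values:
Code:
from dataclasses import dataclass

WEAPON_M4 = 31          # SA-MP weapon ID for the M4
MAX_PING = 150          # placeholder for "ping less than X"
MAX_SHOT_GAP_MS = 1000  # placeholder for "gap between shots less than Y ms"

@dataclass
class Shot:
    weapon: int
    ping: int
    victim_moving: bool
    gap_ms: int
    hit: bool

def accept_series(shots):
    """Return True only if a series of shots passes every restriction."""
    if not (8 <= len(shots) <= 20):
        return False
    if sum(s.hit for s in shots) / len(shots) <= 0.5:  # hit ratio above 0.5
        return False
    return all(
        s.weapon == WEAPON_M4
        and s.ping < MAX_PING
        and s.victim_moving
        and s.gap_ms < MAX_SHOT_GAP_MS
        for s in shots
    )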
Since the problem was complex, the idea of adding restrictions was to reduce the number of variables and ease the learning process, which appears to have worked. I also switched to a simple deep neural network and it performed as well as the SVM. Currently, detections are based on two detectors running in parallel: an SVM and a DNN. Tomorrow I'll be adding an RF detector (which actually did pretty well).
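Conceptually, the parallel detectors amount to something like this; the scikit-learn classes and the unanimous-vote rule are only for illustration, not the actual setup:
Code:
from sklearn.svm import SVC
from sklearn.neural_network import MLPClassifier
from sklearn.ensemble import RandomForestClassifier

detectors = {
    "svm": SVC(kernel="rbf"),
    "dnn": MLPClassifier(hidden_layer_sizes=(32, 16)),
    "rf":  RandomForestClassifier(n_estimators=100),
}

def train_all(X, y):
    for clf in detectors.values():
        clf.fit(X, y)

def classify(sample):
    # Flag only if every detector agrees, to keep false positives low.
    votes = [clf.predict([sample])[0] for clf in detectors.values()]
    return all(v == 1 for v in votes)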
It's also using a very poorly engineered set of features for training (the transformed vectors contain just 16 numbers), but for unexpected reasons it's working, so I haven't bothered putting more effort into feature engineering until I see a significant drop in accuracy (which I will eventually).
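To give a sense of scale, 16 numbers is on the order of mean/std/min/max over four per-shot quantities; the quantities in the sketch below are my guesses, not the actual features:
Code:
import numpy as np

def feature_vector(shots):
    """Turn a window of shots into a fixed 16-number vector: mean/std/min/max
    over four per-shot quantities. The quantities (hit flag, gap between shots,
    aim-angle change, distance to victim) are hypothetical, not the real ones."""
    cols = np.array([
        [float(s.hit), s.gap_ms, s.aim_delta_deg, s.victim_distance]
        for s in shots
    ])
    return np.concatenate([cols.mean(0), cols.std(0), cols.min(0), cols.max(0)])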
Quote:
Originally Posted by DeitY
I have a server with a 900+ player count; I can let them collect an insane amount of data if it helps in any way.
While I do not place much hope in this, at least someone is doing what Kalcor needed to fix 10 years ago.
Initially, I thought of collecting data from live servers and using it as negative samples. It would be a decent approximation, since most players do not use aimbots. But collecting a nearly equal number of aimbot samples then becomes a problem. A small number of aimbot samples cannot be used, because the data collected from live servers already contains noise and data from players who are using aimbots. The quality of the data would also be low.
Currently, the strategy being followed is: collect a minimal amount of data and train the detector to classify aimbot users whenever possible, not always. Hence, only OBVIOUS aimbot samples are being collected. For example, a sample taken from a player using an aimbot to shoot a victim running in a straight line is not a great one, since legitimate players would produce similar samples without using aimbots.
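In practice, "whenever possible, not always" means tuning for precision over recall; roughly like this, with a made-up threshold:
Code:
FLAG_THRESHOLD = 0.95  # made-up value: only flag when the detector is very sure

def should_flag(clf, sample):
    # predict_proba gives P(aimbot); deliberately missing some aimbot users
    # keeps false positives (flagging legit players) near zero.
    p_aimbot = clf.predict_proba([sample])[0][1]
    return p_aimbot >= FLAG_THRESHOLD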
Live Server Statistics
Code:
SVM Results:
positive test set statistics:
average: 0.895024, stddev: 0.202378
min: 0.16941, max: 0.998552
skewness: -2.59443, excess kurtosis: 6.30058
negative test set statistics:
average: 0.0636774, stddev: 0.10247
min: 0.000767444, max: 0.385762
skewness: 2.28472, excess kurtosis: 4.20898
true positives: 39, false positives: 0
true negatives: 38, false negatives: 3
number of samples classified correctly: 77
number of samples classified incorrectly: 3
accuracy: 0.9625
RF Results:
positive test set statistics:
average: 0.855925, stddev: 0.234815
min: 0.068, max: 1.045
skewness: -1.74412, excess kurtosis: 2.32794
negative test set statistics:
average: 0.108018, stddev: 0.148063
min: -0.0975, max: 0.5665
skewness: 1.58524, excess kurtosis: 2.14748
true positives: 37, false positives: 1
true negatives: 37, false negatives: 5
number of samples classified correctly: 74
number of samples classified incorrectly: 6
accuracy: 0.925
DNN Results:
true positives: 39, false positives: 2
true negatives: 36, false negatives: 3
number of samples classified correctly: 75
number of samples classified incorrectly: 5
accuracy: 0.9375
The accuracy looks pretty high because it was trained on a small dataset (500 samples, one kind of aimbot, data from 6 people). It'll most likely fall as data from more aimbots and players is added. Another option might be to have separate detectors for each weapon, aimbot, etc. (sketched below), but that will also become computationally expensive.
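The per-weapon option would look roughly like this; the weapon IDs and detector choice are placeholders:
Code:
from sklearn.svm import SVC

WEAPONS = {24: "deagle", 31: "m4", 34: "sniper"}   # SA-MP weapon IDs

# One detector per weapon, each trained only on samples for that weapon.
detectors_by_weapon = {wid: SVC(kernel="rbf") for wid in WEAPONS}

def classify(weapon_id, sample):
    clf = detectors_by_weapon.get(weapon_id)
    if clf is None:
        return False          # no detector trained for this weapon
    return clf.predict([sample])[0] == 1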