posts/blog/sqli_bin.md
55 additions & 7 deletions
@@ -75,7 +75,7 @@ Look at this example, in which the site will return a positive (OK) response if
75
75
```
76
76
</div>
77
77
78
-
Due to the presence of the vulnerability, we can therefore check whether a condition is true or false, for instance by entering as id **" OR 1=1 --** we would get the answer 200, reasoning on this fact, we can, via the keyword union go to other tables and for example we could check that the ith character of a user's password corresponds to a certain character.
78
+
Because of the vulnerability, we can check whether a condition is true or false. For instance, by entering **" OR 1=1 --** as the id we would get a 200 response. Building on this fact, we can use the __union__ keyword to search in other tables and, for example, check whether the $i$th character of a user's password matches a given character.
79
79
To do this we can, for example, use the [SUBSTR](https://dev.mysql.com/doc/refman/8.4/en/string-functions.html#function_substr) function.
@@ -88,8 +88,8 @@ To do this we can, for example, use the [SUBSTR](https://dev.mysql.com/doc/refma
88
88
```
89
89
</div>
90
90
91
-
In this case, atleast in MySql the pos parameter indicate the starting position from which the substring will be created and the len parameter the number of character which need to be included starting from pos.
92
-
so, by setting **pos=i** and **len=1** we get the ith character.
91
+
In this case, at least in __MySQL__, the __pos__ parameter indicates the __starting position__ (counted from 1) from which the substring is taken, and the __len__ parameter the number of characters __to include starting from pos__.
92
+
So, by setting **pos=i** and **len=1**, we get the $i$th character.
93
93
94
94
Awesome!
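The SUBSTR behaviour described above can be sketched in Python. This is only a rough model for positive, 1-based positions; the helper name `mysql_substr` and the sample password are assumptions made for illustration, not part of MySQL or of the original post.

```python
def mysql_substr(s: str, pos: int, length: int) -> str:
    """Rough model of MySQL's SUBSTR(s, pos, len) for positive pos.

    MySQL counts positions from 1, Python from 0, so we shift by one.
    """
    if pos < 1:
        raise ValueError("this sketch only models positive, 1-based positions")
    start = pos - 1
    return s[start:start + length]


password = "hunter2"  # hypothetical value, for illustration only
i = 3
print(mysql_substr(password, i, 1))  # n
```

With **pos=i** and **len=1** the helper returns exactly one character, which is the value a blind injection compares against each candidate.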
95
95
@@ -129,7 +129,7 @@ Great, this is an example of what we can do by abusing the website response. Obv
129
129
In the algorithm above, we will ignore the fact that our search algorithm is executed $s$ times, where $s = len(brute) \textrm{,} \ s \in \mathbb{N}$.
130
130
131
131
We used the linear search algorithm, which is known to have a complexity of $\mathcal{O}(n) \textrm{, where, in this case, } \ n = |charset| $.
All this assumes that the cost of querying the database is constant; here we denote it by $q \in \mathbb{R}^+$.
134
134
So the time complexity of this algorithm is $\mathcal{O}(nq)$, because the code inside the loop gets executed at most $n$ times and each execution costs $q$.
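To make the $\mathcal{O}(nq)$ count concrete, here is a minimal sketch that counts how many times the oracle is consulted during a linear search. The `oracle` function stands in for one database query of cost $q$; it, the helper name and the sample secret are assumptions made for illustration.

```python
import string


def linear_search_char(oracle, charset):
    """Try each candidate in order; return (character, number_of_queries)."""
    queries = 0
    for c in charset:
        queries += 1          # one query of cost q per candidate
        if oracle(c):
            return c, queries
    return None, queries      # the character is not in the charset


secret_char = "z"             # hypothetical hidden character
charset = string.ascii_lowercase


def oracle(c):
    # Stands in for a single request to the vulnerable site.
    return c == secret_char


print(linear_search_char(oracle, charset))  # ('z', 26)
```

In the worst case the hidden character is the last candidate, so all $n = |charset|$ queries are spent.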
135
135
@@ -138,7 +138,7 @@ So the time complexity of this algorithm is $\mathcal{O}(nq)$, because the code
138
138
139
139
## An optimization, introducing binary search
140
140
141
-
As computer scientists, we want to be able to achieve the best performance, and in this case, what will enable us to achieve this is the fact that we can consider our "array" as ordered.
141
+
As computer scientists, we want to achieve the __best performance__, and in this case what enables us to do so is the fact that we can consider our "array" as __ordered__.
142
142
143
143
Informally, since we know that the array is ordered so that $A_0 \leq A_1 \leq A_2 \leq ... \leq A_{n-1}$,
144
144
the idea is that, instead of searching through all the elements, we can check whether the element at position $\left \lfloor \frac{start+end}{2} \right \rfloor$ (the middle) matches the one we are looking for. If it does not, we check whether the middle element is greater than our target: if it is, our element (assuming it is $\in A$) must lie in the $[start,mid - 1]$ range, because the array is sorted; otherwise it must lie in the upper half, namely $[mid + 1,end]$.
@@ -168,7 +168,7 @@ A[0] == to_find #is 1 equal to 1? yes, found at index 0!
168
168
</div>
169
169
170
170
171
-
In this way we can each time half the range in which we search! We are going to prove the time complexity later.
171
+
In this way we __halve the range__ in which we search at every step! We are going to prove the time complexity later.
172
172
173
173
As computer scientists and mathematicians we want to generalise this reasoning, so here is the pseudocode, using the recursive variant of binary search.
174
174
@@ -224,7 +224,7 @@ print(bruteforced)
224
224
```
225
225
</div>
226
226
227
-
The functions $g$ and $eq$ are the equality conditions we are looking for in the case of a blind sql injection.
227
+
The functions $g$ and $eq$ are the __equality conditions__ we are looking for in the case of a blind SQL injection.
228
228
229
229
The code mirrors the reasoning set out earlier. The only thing we need to pay attention to is the base case of our recursive function: the moment $start > end$, the range is empty and we are certain that our element $ \notin A$.
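As a compact recap of that reasoning, here is a minimal recursive sketch. It is not necessarily identical to the post's own code: the signature, and the assumption that `eq(c)` answers "is the hidden character equal to c?" while `g(c)` answers "is c greater than the hidden character?", are illustrative choices.

```python
def binary_search(charset, start, end, eq, g):
    """Return the index of the hidden character in sorted `charset`, or -1.

    eq(c): True if the hidden character equals c   (assumed oracle)
    g(c):  True if c is greater than the hidden character (assumed oracle)
    """
    if start > end:            # base case: empty range, element not in charset
        return -1
    mid = (start + end) // 2
    if eq(charset[mid]):
        return mid
    if g(charset[mid]):        # middle is greater: keep the lower half
        return binary_search(charset, start, mid - 1, eq, g)
    return binary_search(charset, mid + 1, end, eq, g)


charset = [chr(i) for i in range(128)]   # ASCII, already sorted
hidden = "k"                             # hypothetical hidden character
idx = binary_search(charset, 0, len(charset) - 1,
                    lambda c: c == hidden, lambda c: c > hidden)
print(charset[idx])  # k
```

In a real blind injection each call to `eq` or `g` would be one request to the site, so the number of calls is what the time complexity analysis counts.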
230
230
@@ -298,4 +298,52 @@ We have proven the correctness of binary search! Now let's talk about performanc
298
298
299
299
## Time complexity
300
300
301
+
In computer science, one way to describe the performance of an algorithm is to assign a cost to each instruction and count the number of times every instruction is executed (taking into account the relative costs, of course).
302
+
To approximate this number, mathematical notations have been introduced that give an upper or lower bound (or both) on the number of operations that are executed.
303
+
Here is an example; we will take knowledge of these notations for granted.
In the case of recursive functions that use the divide and conquer paradigm, we can express the execution time $T(n)$ using a mathematical concept called a recurrence relation, or recursive sequence.
308
+
A recurrence relation is nothing more than a sequence in which the value of each term depends on the values of one or more of the terms before it.
309
+
310
+
One example is the famous Fibonacci sequence, defined as follows: $$ F(n) = \begin{cases}
311
+
\ 1 & \textrm{if} \ \ n = 1 \lor n = 2 \newline
312
+
\ F(n-1) + F(n-2) & \textrm{otherwise}
313
+
\end{cases} $$
314
+
For example, to find the value of $F(4)$, all we have to do is expand the definition: $F(4) = F(3) + F(2) = (F(2) + F(1)) + F(2) = 1 + 1 + 1 = 3$.
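The same definition can be transcribed almost literally into code. This is only a direct, unoptimised sketch of the recurrence, meant to show how each term is built from earlier ones.

```python
def fib(n: int) -> int:
    """Direct transcription of the recurrence F(n) for n >= 1."""
    if n in (1, 2):                  # base cases: F(1) = F(2) = 1
        return 1
    return fib(n - 1) + fib(n - 2)   # each term depends on the two before it


print(fib(4))  # 3
```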
315
+
316
+
As far as the binary search algorithm is concerned, glossing over some details, its execution time is given by: $$ T(n) = \begin{cases}
317
+
\ 1 & \textrm{if} \ \ n = 1 \newline
318
+
\ T(\frac{n}{2}) + 1 & \textrm{if} \ \ n > 1
319
+
\end{cases} $$
320
+
321
+
To solve a recurrence like this and find the running time, there are several methods, such as the iterative method, recursion trees, the substitution method and the master theorem.
322
+
For simplicity's sake, we will use the iterative method, which allows us to write our recurrence in a non-recursive form (i.e. one that does not depend on previously computed values).
323
+
This technique consists in repeatedly passing smaller and smaller inputs to our recursive function, which lets us spot a pattern from which to deduce the execution time.
324
+
Let's see it in action.
325
+
326
+
$$ T(n) = T(\frac{n}{2}) + 1 $$
327
+
We pass $\frac{n}{2}$ as input to $T$, then $\frac{n}{4}$, and so on.
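As a sketch of how the first few substitutions unfold (reconstructed here from the recurrence $T(n) = T(\frac{n}{2}) + 1$, so the exact presentation is an assumption), each step halves the argument and adds $1$ to the constant term:

$$ T(n) = T(\frac{n}{2}) + 1 = T(\frac{n}{2^2}) + 2 = T(\frac{n}{2^3}) + 3 = \dots $$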
From these considerations, we can rewrite $T(n)$ explicitly as: $T(n) = T(\frac{n}{2^k}) + k$, where $k$ is the number of times we have performed the substitution.
337
+
Now that we have written it as a function of $k$, we want to know after how many steps our function will reach the base case.
338
+
We know that the base case is reached when the argument of $T$ equals $1$, so all we need to do is solve this equation for $k$.
339
+
$$ \frac{n}{2^k} = 1 \Rightarrow k = \log_2(n)$$
340
+
Knowing this, we can state that: $$T(n) = T(\frac{n}{2^{\log_2(n)}}) + \log_2(n) = T(1) + \log_2(n) = \log_2(n) + 1 = \mathcal{O}(\log_2(n))$$
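The closed form can be sanity-checked numerically. The sketch below evaluates the recurrence directly and compares it with $\log_2(n) + 1$; it assumes $n$ is a power of two, so that $\frac{n}{2}$ is always an integer, and the function name `T` simply mirrors the notation above.

```python
import math


def T(n: int) -> int:
    """Evaluate the recurrence T(1) = 1, T(n) = T(n/2) + 1 for powers of two."""
    if n == 1:
        return 1
    return T(n // 2) + 1


for n in (1, 2, 8, 128, 1024):
    # The closed form derived above: log2(n) + 1
    assert T(n) == int(math.log2(n)) + 1

print(T(128))  # 8
```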
341
+
342
+
So, we have just proved that, given an ordered array $A, \ \ |A| = n$, binary search determines whether an element $a \in A$ or not in logarithmic time!
343
+
344
+
## Conclusion
345
+
346
+
<img src="/images/blog_images/plot.png">
347
+
348
+
These are the __graphs__ of a __linear__ and a __logarithmic__ function: you can immediately see that, as $n$ increases, we get a __large saving__ in operations.
349
+
With the way we have structured our __blind SQL injection algorithm__, we can find a character in roughly $\log_2(128)$ steps, i.e. $7$, instead of the up to $128$ needed by linear search. And remember that the __larger__ the input becomes (think UTF-8 strings), the **greater the gain!**
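To put numbers on that gain, here is a small sketch comparing worst-case step counts for a few charset sizes. It counts one step per halving plus the final comparison, i.e. $\lfloor \log_2(n) \rfloor + 1$, so it reports one more than the rounded $\log_2(128) = 7$ quoted above. The function name and the chosen sizes are assumptions made for illustration.

```python
def worst_case_steps(n: int) -> tuple:
    """Return (linear, binary) worst-case step counts for a charset of size n."""
    linear = n
    # For n >= 1, n.bit_length() equals floor(log2(n)) + 1, computed exactly.
    binary = n.bit_length()
    return linear, binary


for n in (128, 256, 65536):
    linear, binary = worst_case_steps(n)
    print(f"n={n}: linear={linear}, binary={binary}")
```

Doubling the charset doubles the linear cost but adds only a single step to the binary one.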