We begin with the empty set, represented as:
{}
We construct a set that contains the empty set:
{ {} }
We construct a set that contains both the set that contains the empty set, and the empty set itself:
{ {}, { {} } }
We construct a set that contains the previous set, the one before the previous set, and the empty set:
{ {}, { {} }, { {}, { {} } } }
And so on.
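If it helps to see the "and so on" spelled out, here is a minimal Python sketch of the construction. It assumes the pattern continues the obvious way: each new set contains every set built so far, which amounts to taking the previous set and adding the previous set itself as one more element. The names `successor`, `build` and `show` are made up for this illustration.

```python
def successor(s: frozenset) -> frozenset:
    """Return a set holding every element of s, plus s itself."""
    return s | frozenset([s])


def build(n: int) -> list:
    """Return the first n + 1 sets, starting from the empty set."""
    sets = [frozenset()]
    for _ in range(n):
        sets.append(successor(sets[-1]))
    return sets


def show(s: frozenset) -> str:
    """Render a set in the brace notation used above."""
    if not s:
        return "{}"
    # Elements all have different sizes here, so sorting by size is stable.
    return "{ " + ", ".join(show(e) for e in sorted(s, key=len)) + " }"


for s in build(3):
    print(show(s))
# {}
# { {} }
# { {}, { {} } }
# { {}, { {} }, { {}, { {} } } }
```

frozenset is used rather than set because Python only allows hashable (immutable) objects as set elements.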
Now, we define something called 'cardinality', and pair each constructed set one-to-one with a 'thing' that is its cardinality:
{} -----------> Thing A
{{}} -----------> Thing B
{{},{{}}} -----------> Thing C
{{},{{}},{{},{{}}}} -----------> Thing D
... and so on.
We choose to give alternative names to these 'things'.
Thing A will be '0'.
Thing B will be '1'.
Thing C will be '2'.
Thing D will be '3'.
... and so on.
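As a rough illustration of that pairing, the sketch below rebuilds the first four sets and pairs each with its element count, which is what Python's `len` returns. The helper `successor` and the list `labels` are hypothetical names for this example only, and the construction rule (previous set plus itself as a new element) is assumed from the pattern above.

```python
def successor(s: frozenset) -> frozenset:
    """Return a set holding every element of s, plus s itself."""
    return s | frozenset([s])


# Build the sets paired with Thing A through Thing D.
sets = [frozenset()]
for _ in range(3):
    sets.append(successor(sets[-1]))

labels = ["Thing A", "Thing B", "Thing C", "Thing D"]

# Pair each set with its number of elements.
for label, s in zip(labels, sets):
    print(label, "->", len(s))
# Thing A -> 0
# Thing B -> 1
# Thing C -> 2
# Thing D -> 3
```

The labels are arbitrary; the counts fall out of the construction regardless of what we call them.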
As previous posters have already stated, it doesn't really matter what we call each thing.
We could call them apples and bananas; it would just be inconvenient.
The 'Essence of Numbers', as you put it, is in the relationship between the sets. The set paired with Thing A is always 'smaller' than the set paired with Thing B, because we can construct the latter from the former, but not the other way around.
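One way to make 'smaller' concrete, sketched in Python under the same assumed construction rule: an earlier set is both an element of and a proper subset of every later set, and never the reverse. The function name `is_smaller` is invented for this example.

```python
def successor(s: frozenset) -> frozenset:
    """Return a set holding every element of s, plus s itself."""
    return s | frozenset([s])


def is_smaller(x: frozenset, y: frozenset) -> bool:
    """True when x is an element of y and also a proper subset of y."""
    return x in y and x < y


a = frozenset()   # paired with Thing A
b = successor(a)  # paired with Thing B
c = successor(b)  # paired with Thing C

print(is_smaller(a, b))  # True
print(is_smaller(b, c))  # True
print(is_smaller(a, c))  # True
print(is_smaller(b, a))  # False
```

Nothing here depends on the names '0', '1', '2'; the ordering comes entirely from how the sets sit inside one another.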