A one-dimensional array is a row of data items, all of the same type. The word "array" is used in programming in general to describe this type of structure, but, in C++, the word "array" is used to refer rather more specifically to the implementation of this structure in C. Since C++ subsumes all of C, you can have C-style arrays in a C++ program. But C-style arrays are inflexible and in some ways awkward to use, so we will use the C++ implementation of the structure, which is a vector.
Just as a C++ string
is more than a mere row of characters - it
can also do things for you, such as telling you how long it is or providing a
substring - so a vector
is more than a mere row of, say, integers
or doubles. Like a string
, a vector
is an object
and has a number of member functions. To use vectors
in your code, you need the appropriate library - #include <vector>
.
Vectors are declared as in this example:
vector<int> ivec(4);
which declares a vector called ivec, containing four integers. In memory it looks like this:
0 | 0 | 0 | 0 |
By default the data items are initialised to zero (if they are numbers, or empty strings if they are strings). If you want to initialize the vector to a different value, add an argument to the declaration as follows:
vector<int> ivec(4, 3);
This would create a vector of four elements, each one initialized to 3. ivec
looks like this:
3 | 3 | 3 | 3 |
The elements of a vector are numbered, starting from zero. So the first element of ivec
would be ivec[0]
, the second element would be ivec[1]
, and so on. These numbers (the 0 and the 1) are
called subscripts.
To refer to individual elements of the vector you use the element's subscript
inside square brackets. Note that the first element of the vector has subscript
zero. In other respects, the elements of an integer vector behave just like ordinary integer variables, as in these examples:
ivec[2] = 95; ivec[0] = ivec[2] + ivec[3];
After these operations, ivec looks like this:
98 | 3 | 95 | 3 |
You can use the elements of the vector in the same way as you would any other variable of the same type, for example:
x = ivec[0] - ivec[2]; if (ivec[0] < ivec[3]) return;
The thing that goes into the square brackets is actually an expression (integer constants and single variables are simple expressions). So, as well as having things like ivec[2]
, you can also have ivec[x]
, where x
is an integer variable, or ivec[x-2]
or ivec[x+y-z]
When you use vectors you must take care that you do not try to reference an element
beyond the bounds of the vector by using a subscript outside the valid values for
that vector - for example, trying to access ivec[-1]
or ivec[4]
or ivec[99]
given ivec
as defined above. The operating system will not check this for you and you may find
yourself using values from undefined portions of memory or, worse, overwriting data. At best
the o/s might issue a segmentation fault error if you try to use a subscript that
is way out of range. You might also consider using the at
function in preference to the square brackets. ivec.at(92)
is the same as ivec[92]
except that it will terminate the program if ivec[92]
does not exist. The square bracket notation is widely used in other programming languages and in writing about programming generally.
Let us create a vector v of eight integers and initialize it so that each element's value is the same as its index.
vector<int> v(8); for (int i = 0; i < 8; i++) v[i] = i;
v
looks like this:
0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
The following code causes any negative elements to be set to zero:
for (int i = 0; i < 8; i++) if (v[i] < 0) v[i] = 0;
The size()
member function for vectors, like the length()
member function for strings, returns the number of elements in the vector. Using v
from the previous example, v.size()
returns a value of 8. Note that, because subscripts begin at zero, v.size()
is always one more than
the highest subscript of v
.
You can use the = (assignment) operator to assign one vector to another, for example v1
= v2
, so long as they are vectors of the same type (eg both vector<int>
or vector<double>
). In this case the contents of v1 are overwritten with
the contents of v2 and v1 is truncated or extended to
be the same length as v2.
This function allows you to add elements to the end of a vector. Suppose you create an uninitialised vector as follows:
vector<int> evec;
so that evec.size()
is zero. evec is effectively useless
until you add some elements to it. The following statement will add an element
to the end (the back) of evec and initialise it with the value 21.
evec.push_back(21);
At this point, evec[0]
is 21 and evec.size()
is one.
Now we can add another element and initialise it with the value 33:
evec.push_back(33);
Now evec[0]
is 21, evec[1]
is 33 and evec.size()
is two.
This function removes the last element from the vector. A typical call looks like this:
evec.pop_back();
Unfortunately pop_back()
doesn't return the value of the popped
element. If you need it, you have to copy it to somewhere else before you pop, for example:
to_be_popped = evec[evec.size()-1]; evec.pop_back();
Let us consider a function that returns the sum of the elements of a vector of integers:
int total(vector<int> v) { int total = 0; for (int i = 0; i < v.size(); i++) total += v[i]; return total; }
Note that the vector here is passed by value, meaning that the vector's data will
be copied for use locally. If the vector is small, there is no problem, but
if it is large, then the parameter-passing itself could consume a lot of resources. For this reason
it is a good habit to pass vectors by reference, and use the const
keyword to ensure that the data doesn't get inadvertently modified.
Here is a function that takes as its arguments a vector of integers and an
integer. It returns true
if the integer is in the vector.
bool isin(const vector<int>& v, int a) { for (int i = 0; i < v.size(); i++) if (v[i] == a) return true; return false; }
Let us look at the following code fragment for loading words into a vector:
string s; vector<string> v; while (cin >> s) v.push_back(s);
The system will continue to populate the vector v until the input stream fails.
The following procedure also adds new words to a vector but does not store duplicates:
void add_word(vector<string>& vs, string s) // the procedure modifies the vector, // so we pass by reference (without const) { for (int i = 0; i < vs.size(); i++ ) if (s == vs[i]) return; // don't add the word as it's // already in the vector vs.push_back(s); }
We could call this procedure thus:
string s; vector<string> v; while (cin >> s) add_word(v,s);
In C++ a string behaves in many ways like a vector of char
elements; in particular, you can use the square-bracket operator to access a single character. char
is a datatype; a char
is a single character. For example:
string s = "eyrie"; s[1] = s[0]; // s is now "eerie" string t = "cat"; char ch = t[0]; // ch is the letter 'c'; note that ch is of type char, not type string t[0] = t[1]; t[1] = ch; // t is now "act"
string
literals are enclosed in double quotes "Z"
is a string of length one
whereas char
literals are enclosed in single quotes 'Z'
is a single character value.
You can also define strings using the same sort of constructor function that you use for vectors. For example:
string s(20, '*'); // creates a string of length 20, initialised to asterisks; note that the second argument is a char, not a string
The library cctype
contains a number of useful predicate functions
for returning information about char
data, for example:
isupper(ch) |
returns true if ch is upper case |
islower(ch) |
returns true if ch is lower case |
isdigit(ch) |
returns true if ch is a digit |
isalpha(ch) |
returns true if ch is a letter |
Many languages draw a clear distinction between the types char
and int
but C and C++, to a large extent, treat them as interchangeable, with implicit type conversions from one to the other. For example, you could do the following:
int n; char ch; n = 'A' + '!'; ch = n; cout << n << " " << ch << endl;and the output would be 98 b. That is, you get the ASCII value of 'A' added to the ASCII value of '!', giving 98, and ch takes the value of the char with ASCII value 98, which is 'b'.
This kind of thing is generally confusing and best avoided, but it can occasionally be useful. If, for example, you had a char variable with a digit as its value, you could convert the digit to the corresponding integer value simply by subtracting '0'. For example:
char ch = '9'; int n = ch - '0'; // 57 - 48, so n is initialized to the value 9or vice-versa:
int n = 9; char ch = n + '0'; // ch is initialized to the value '9'
C++ strings are different from C strings. C strings are arrays of char
,
not vectors (more about arrays and C strings later in the course). If you needed to cast a C++ string into a C string,
you would use s.c_str()
(where s
is a string).
You can use the at
function with strings, as you can with vectors, as an alternative to square brackets so as to get the benefit of array bounds checking. It can be used as follows:
string s = "hello, world!"; cout << s.at(7) << endl; // returns letter w s.at(12) = '?'; // changes ! to ? cout << s.at(12) << endl; cout << s.at(s.length()) << endl; // aborts the program - there is no s[13]
However, a string
is not exactly the same as a vector <char>
. For example, a string
has push_back
(you could have s.push_back(ch);
to add a char to a string) but it does not have a pop_back
.
Finally a word of warning. Because vectors in C++ are so flexible and versatile, novice programmers are inclined to overuse them. Use a vector where it's needed or where it will really help. Don't use one just because you can.
Say you are writing a program that reads in a file of numbers and outputs the largest. Some programmers, especially those who have just discovered vectors, will read the entire file of numbers into a vector and will then search through the vector looking for the highest. But a moment's thought will show that the vector here is not serving any purpose. You only need to inspect the numbers one at a time, and you can do that when you read them from the file in the first place.
Unnecessary vectors, apart from being a waste of computer storage, tend to clutter a program and, often, to confuse the programmer. The vector is a very useful tool, but don't use it always and for everything.
Notes on R. Mitton's lectures by S.P. Connolly, edited by R. Mitton, 2000