Learn Python by Building Data Science Applications

Var-positional and var-keyword

In some cases, we don't know beforehand exactly how many arguments will be passed to a function. For example, the print function takes and prints any number of arguments. We can do the same in our own functions using *args. Here, args is not a keyword but a conventional name for the packed values; technically, you can use any name you want. The actual work is done by the asterisk, which indicates that all extra positional values will be packed into one tuple (we'll discuss tuples in Chapter 4, Data Structures). Within the function's scope, you can use args as a tuple variable, or pass it on to another function using the same asterisk.
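As a minimal sketch of both uses (the function names total and average, and the parameter name values, are made up for this illustration), the first function below treats the packed values as a tuple, while the second passes them on with the same asterisk:

```python
def total(*values):
    # values is an ordinary tuple holding every positional argument
    return sum(values)


def average(*values):
    # the asterisk unpacks the tuple back into separate arguments
    return total(*values) / len(values)


print(total(1, 2, 3))    # 6
print(average(2, 4, 6))  # 4.0
```

Note that the parameter is called values rather than args here, which shows that the name really is only a convention.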

Similarly, in some cases, those multiple values are named. To accept multiple named arguments, use **kwargs (which stands for keyword arguments). As with args, the name itself is a mere convention; all the heavy lifting is done by the double asterisk, which groups those arguments into a dictionary (another type of data structure that will be explained in the next chapter). Within the function, kwargs can be used as a single entity (a dictionary), or passed down using the same double asterisk. Both *args and **kwargs are defined after the normal positional parameters, and **kwargs must come last. Consider this example:

def f(a, b, *args, **kwargs):
    return a + b

This function simply adds its first two arguments. It will, however, accept any number of additional arguments without any effect on its result.

In the following call, we pass 1 and 2 as the first two arguments, then an extra positional value, 10, and finally one named argument. Because the function declares both *args and **kwargs, these extra arguments won't raise any errors. In fact, they are simply ignored, as the function body uses neither args nor kwargs:

>>> f(1, 2, 10, other_argument=0)
3
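To see where the extra values actually go, here is a small sketch of a variant of this function (the name inspect_call is hypothetical) that returns args and kwargs alongside the sum instead of ignoring them:

```python
def inspect_call(a, b, *args, **kwargs):
    # args collects extra positional values; kwargs collects extra named ones
    return a + b, args, kwargs


result = inspect_call(1, 2, 10, other_argument=0)
print(result)  # (3, (10,), {'other_argument': 0})
```

The extra positional value ends up in a one-element tuple, and the named argument becomes a key in the dictionary.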

args and kwargs are indeed very useful. One frequently encountered case is when your function calls an external function that has plenty of optional parameters. Instead of declaring all of them again in your own function, you can just pass kwargs through. This way, the code remains both concise and flexible (and you often won't need to change your own function if the external function gains new optional parameters). Consider the following example:

# sets up plotting in jupyter
%matplotlib inline

import pylab as plt


def draw_scatter(x, y, color='k', **kwargs):
    plt.scatter(x, y, color=color, **kwargs)

In the preceding code, we use pylab, one of the interfaces to matplotlib, a data visualization library. Like most visualization functions, plt.scatter (which, you guessed it, draws a scatter plot) has many optional parameters that define the color, shape, opacity, size, and edge style of the markers, among other things. It would be impractical to replicate all those options (and their documentation) within your code. Instead, we can use kwargs to collect whatever arguments you want to pass on to the scatter function.

Here is an example of the function in use. As you can see, we can pass any keyword argument that the original scatter function accepts, and it will be passed through and used:

draw_scatter([1, 2, 3], [3, 2, 1], s=[10, 100, 300])

As a result, we get the following diagram. Obviously, the data is meaningless, but the size argument we passed is reflected in the resulting chart:
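The same forwarding pattern can be sketched without any plotting library. In the following illustration, render is a hypothetical stand-in for an external function (it is not part of matplotlib), and draw is a wrapper around it, analogous to draw_scatter:

```python
def render(x, y, color='k', size=20, alpha=1.0):
    # stand-in for an external function with several optional parameters
    return {'x': x, 'y': y, 'color': color, 'size': size, 'alpha': alpha}


def draw(x, y, color='b', **kwargs):
    # the wrapper sets its own default color and forwards everything else
    return render(x, y, color=color, **kwargs)


settings = draw([1, 2], [2, 1], alpha=0.5)
print(settings['color'], settings['alpha'], settings['size'])  # b 0.5 20

try:
    draw([1, 2], [2, 1], opacity=0.5)  # render has no such parameter
except TypeError as error:
    print('rejected:', error)
```

One consequence of this design is that the wrapper does not check the names it forwards: a misspelled or unsupported keyword is only rejected once it reaches the inner function.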

In the preceding example, we imported the pylab library, one of the interfaces for the matplotlib plotting library. The keyword as in the import line, import pylab as plt, allows us to define an alias for the library. While the alias is arbitrary and could be anything you want, most popular libraries have widely adopted, near-standard aliases. It makes sense to stick with the popular ones to make the code easier to read and understand. Some other libraries with well-known aliases are pandas (pd), numpy (np), and seaborn (sns).

We'll cover import in depth in the following chapters.