
Statistical study of relationships between phenomena. The concept of functional and correlation connection

1. Types of relationships and the concept of correlation dependence

2. Methods for identifying correlations

3. Univariate correlation and regression analysis

4. Multivariate correlation and regression analysis

5. Nonparametric measures of connection

1. Types of relationships and the concept of correlation dependence

All statistical indicators are interconnected by various connections and relationships. The task of statistical research is to determine the nature of these relationships.

The following types of relationships exist:

1. Factorial. Such connections are manifested in the coordinated variation of different characteristics in the same population: one of the characteristics acts as a factor, the other as a consequence. These connections are studied by the method of groupings and by the theory of correlation.

2. Component. This type includes relationships in which a change in a complex phenomenon is entirely determined by a change in the components included in it as factors (e.g., X = x · f). The index method is used for this.

For example, using a system of interconnected indices, they learn how trade turnover has changed due to changes in the number of goods sold and prices.

3. Balance sheet. Such connections are used to analyze connections and proportions in the formation of resources and their distribution. A balance is a system of indicators consisting of two sums of absolute values connected by an equals sign,

a + b = c + d.

For example, the balance of material resources:

opening balance + receipts = expenditure + closing balance.

When studying relationships, signs (indicators) are divided into 2 types:

- signs that cause changes in others are called factorial, or simply factors;

- signs that change under the influence of factor signs are called resultant (effective).

There are 2 types of relationships: functional and stochastic.

Functional is a relationship in which a certain value of the factor characteristic corresponds to one and only one value of the resultant characteristic.

If a causal dependence does not appear in every case, but in general, on average, with a large number of observations, then such a relationship is called stochastic.

A special case of a stochastic connection is the correlation connection, in which a change in the average value of the resultant characteristic is caused by a change in the factor characteristic.

Features of stochastic (correlation) connections:

- they are found not in isolated cases, but in general and on average, with a large number of observations;

- they are incomplete: they take into account not all the factors at work, but only the essential ones;

- they are irreversible: while a functional relationship can be transformed into another functional relationship, a correlation cannot; if we say that the yield of agricultural products depends on the amount of fertilizer applied, the opposite statement makes no sense.

By direction, connections are divided into direct and inverse. In a direct connection, as the factor attribute increases, the resultant attribute increases. In an inverse connection, as the factor attribute increases, the resultant attribute decreases.

By analytical expression, linear (straight-line) and nonlinear (curvilinear) connections are distinguished. If the connection between phenomena is expressed by the equation of a straight line, it is linear; if it is expressed by the equation of a curve (parabola, hyperbola, power, exponential, etc.), it is nonlinear.

By the number of factors acting on the resultant sign, single-factor and multifactor connections are distinguished. If one sign-factor and one resultant sign interact, the connection is single-factor (paired regression); if there are 2 or more factors, the connection is multifactor (multiple regression).

Connections are also distinguished by the degree of closeness (see the Chaddock scale).

8.1. Basic concepts of correlation and regression analysis

When studying nature, society, and the economy, it is necessary to take into account the interrelationship of observed processes and phenomena. In this case, the completeness of the description is one way or another determined by the quantitative characteristics of the cause-and-effect relationships between them. Assessing the most significant of them, as well as the impact of some factors on others, is one of the main tasks of statistics.

The forms in which relationships manifest themselves are very diverse. The two most common types are the functional (complete) and the correlation (incomplete) connection. In the first case, a given value of the factor characteristic corresponds strictly to one or several definite values of the function. Functional connections appear quite often in physics and chemistry. In economics, an example is the directly proportional relationship between labor productivity and increased production.

A correlation relationship (also called incomplete, or statistical) appears on average, over mass observations, when given values of the independent variable correspond to a set of probable values of the dependent variable. The explanation is the complexity of the relationships between the analyzed factors, whose interaction is influenced by unaccounted-for random variables. Therefore, the connection between the signs appears only on average, in the mass of cases. In a correlation relationship, each value of the argument corresponds to values of the function randomly distributed within a certain interval.

For example, a slight increase in the argument entails only an average increase or decrease (depending on the direction) in the function, while specific values for individual observation units differ from the average. Such dependencies are found everywhere. In agriculture, for example, this is the relationship between yield and the amount of fertilizer applied: the latter obviously participates in the formation of the crop, but for each specific field or plot the same amount of applied fertilizer will cause a different increase in yield, since a number of other factors (weather, soil condition, etc.) interact to form the final result. On average, however, such a relationship is observed: an increase in the mass of applied fertilizer leads to an increase in yield.

According to direction, connections are direct, when the dependent variable increases as the factor attribute increases, and inverse, in which the growth of the latter is accompanied by a decrease in the function. Such connections can also be called positive and negative, respectively.

Regarding their analytical form, connections are linear and nonlinear. In the first case, linear relationships appear, on average, between the characteristics; in the second, the relationship is expressed by a nonlinear function, and the variables are related to each other nonlinearly on average.

There is another quite important characteristic of connections from the point of view of the number of interacting factors. If a connection between two characteristics is characterized, it is usually called paired. If more than two variables are studied, it is called multiple.

The above classification criteria are those most often encountered in statistical analysis. In addition to those listed, however, connections are also divided into direct, indirect and false. The essence of each is evident from its name. In the first case, the factors interact with each other directly. An indirect connection involves some third variable that mediates the relationship between the characteristics being studied. A false connection is one established formally and, as a rule, confirmed only by quantitative estimates; it has no qualitative basis or is meaningless.

By strength, connections are divided into weak and strong. This formal characteristic is expressed in specific quantities and is interpreted in accordance with generally accepted criteria of connection strength for specific indicators.

In the most general form, the task of statistics in the field of studying relationships is to quantify their presence and direction, as well as to characterize the strength and form of influence of some factors on others. To solve it, two groups of methods are used, one of which includes methods of correlation analysis, and the other - regression analysis. At the same time, a number of researchers combine these methods into correlation-regression analysis, which has some basis: the presence of a number of general computational procedures, complementarity in the interpretation of results, etc.

Therefore, in this context, we can talk about correlation analysis in a broad sense - when the relationship is comprehensively characterized. At the same time, there is a correlation analysis in the narrow sense - when the strength of the connection is examined - and regression analysis, during which its form and the impact of some factors on others are assessed.

The tasks of correlation analysis proper reduce to measuring the closeness of the connection between varying characteristics, determining unknown causal connections, and assessing the factors that exert the greatest influence on the resultant characteristic.

The tasks of regression analysis lie in establishing the form of the dependence, determining the regression function, and using the equation to estimate unknown values of the dependent variable.

The solution to these problems is based on appropriate techniques, algorithms, indicators, the use of which gives grounds to talk about the statistical study of relationships.

It should be noted that traditional methods of correlation and regression analysis are widely represented in various statistical software packages. The researcher only has to prepare the information correctly, select a package that meets the requirements of the analysis, and be ready to interpret the results obtained. There are many algorithms for calculating connection parameters, and at present it is hardly advisable to carry out such a complex type of analysis manually. The computational procedures are of independent interest, but knowledge of the principles of studying relationships, and of the possibilities and limitations of particular methods of interpreting results, is a prerequisite for research.

Methods for assessing the strength of a connection are divided into parametric (correlation) and nonparametric. Parametric methods are based, as a rule, on estimates of the normal distribution and are used when the population being studied consists of values that obey the normal distribution law. In practice, this assumption is most often accepted a priori. These methods are parametric in the proper sense and are usually called correlation methods.

Nonparametric methods do not impose restrictions on the distribution law of the studied quantities. Their advantage is the simplicity of calculations.

8.2. Pairwise correlation and pairwise linear regression

The simplest technique for identifying a relationship between two characteristics is to construct a correlation table:

X \ Y   Y_1   Y_2   ...   Y_z   Total
X_1     f_11  f_12  ...   f_1z  f_1.
X_2     f_21  f_22  ...   f_2z  f_2.
...     ...   ...   ...   ...   ...
X_k     f_k1  f_k2  ...   f_kz  f_k.
Total   f_.1  f_.2  ...   f_.z  n

The grouping is based on the two characteristics studied in the relationship, X and Y. The frequencies f_ij show the number of corresponding combinations of X and Y. If the f_ij are located randomly in the table, one can speak of the absence of a relationship between the variables. If the f_ij form some characteristic combination, it is permissible to assert a connection between X and Y; moreover, if the f_ij are concentrated near one of the two diagonals, a direct or inverse linear connection takes place.

A visual representation of the correlation table is the correlation field. It is a graph in which X values are plotted on the abscissa axis, Y values on the ordinate axis, and each combination of X and Y is shown by a dot. By the location of the dots and their concentration in a certain direction, one can judge the presence of a connection.
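For illustration, a correlation field can be drawn in a few lines of Python with matplotlib; the paired data below are invented for the sketch.

```python
import matplotlib.pyplot as plt

# Invented paired observations: X is the factor, Y the resultant characteristic
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2.1, 2.9, 3.8, 4.2, 5.1, 5.8, 6.9, 7.4]

plt.scatter(x, y)                       # each dot is one (X, Y) combination
plt.xlabel("X (factor characteristic)")
plt.ylabel("Y (resultant characteristic)")
plt.title("Correlation field")
plt.show()
```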

In the margins of the correlation table, two distributions are given - one for X and one for Y. For each X_i let us calculate the average value of Y, i.e. the conditional mean

ȳ_i = Σ_j (Y_j · f_ij) / Σ_j f_ij.

The sequence of points (X_i, ȳ_i) gives a graph that illustrates the dependence of the average value of the resultant characteristic Y on the factor X - the empirical regression line, clearly showing how Y changes as X changes.

Essentially, the correlation table, the correlation field and the empirical regression line already give a preliminary characterization of the relationship, once the factor and resultant characteristics have been selected and assumptions about the form and direction of the relationship are to be formulated. At the same time, a quantitative assessment of the closeness of the connection requires additional calculations.

In practice, the linear correlation coefficient is widely used to quantify the closeness of a connection; sometimes it is simply called the correlation coefficient. If values of the variables X and Y are given, it is calculated using the formula

r = Σ(X_i − X̄)(Y_i − Ȳ) / √( Σ(X_i − X̄)² · Σ(Y_i − Ȳ)² ).

You can use other, algebraically equivalent formulas; the result must be the same for all calculation options.

The correlation coefficient takes values in the range from −1 to +1. It is generally accepted that if |r| < 0.30, the connection is weak; if |r| = 0.3-0.7, moderate; if |r| > 0.70, strong or close. When |r| = 1, the connection is functional. If r is close to 0, there are grounds to speak of the absence of a linear relationship between Y and X; however, a nonlinear interaction is possible, which requires additional verification using other measures discussed below.
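To make the computation concrete, here is a minimal numpy sketch of the formula above on invented data, with the verbal interpretation from the stated thresholds:

```python
import numpy as np

# Invented paired data; any two equal-length numeric sequences will do
x = np.array([1, 2, 3, 4, 5, 6, 7, 8], dtype=float)
y = np.array([2.1, 2.9, 3.8, 4.2, 5.1, 5.8, 6.9, 7.4], dtype=float)

# Linear correlation coefficient by the formula above
r = np.sum((x - x.mean()) * (y - y.mean())) / np.sqrt(
    np.sum((x - x.mean()) ** 2) * np.sum((y - y.mean()) ** 2))
assert np.isclose(r, np.corrcoef(x, y)[0, 1])  # matches the library routine

# Verbal interpretation following the thresholds above
if abs(r) < 0.3:
    strength = "weak"
elif abs(r) <= 0.7:
    strength = "moderate"
else:
    strength = "strong"
print(f"r = {r:.3f} ({strength})")
```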

Regression analysis methods are used to characterize the impact of changes in X on the variation of Y. In the case of a paired linear relationship, a regression model is built:

Y_i = a_0 + a_1·X_i + e_i,  i = 1, ..., n,

where n is the number of observations;

a_0 and a_1 are the unknown parameters of the equation;

e_i is the random error of the variable Y.

The regression equation is written as

Y_i,theor = a_0 + a_1·X_i,

where Y_i,theor is the calculated (equalized) value of the resultant characteristic obtained by substituting X_i into the equation.

Parameters a_0 and a_1 are estimated using special procedures, the most widely used of which is the least squares method. Its essence lies in the fact that the best estimates of a_0 and a_1 are obtained when

Σ (Y_i − Y_i,theor)² → min,

i.e. the sum of squared deviations of the empirical values of the dependent variable from those calculated using the regression equation should be minimal. The sum of squared deviations is a function of the parameters a_0 and a_1. Its minimization is carried out by solving the system of normal equations

n·a_0 + a_1·ΣX_i = ΣY_i,
a_0·ΣX_i + a_1·ΣX_i² = ΣX_i·Y_i.

You can also use other formulas arising from the least squares method, for example:

a_1 = (n·ΣX_iY_i − ΣX_i·ΣY_i) / (n·ΣX_i² − (ΣX_i)²),   a_0 = Ȳ − a_1·X̄.

The linear regression apparatus is quite well developed and, as a rule, is included in standard statistical software. The meaning of the parameters is important: a_1 is the regression coefficient, characterizing the effect a change in X has on Y. It shows by how many units, on average, Y will change when X changes by one unit. If a_1 > 0, the connection is positive; if a_1 is negative, an increase in X by one entails a decrease in Y on average by a_1. The parameter a_1 has the dimension of the ratio of Y to X.

The parameter a_0 is the constant term of the regression equation. In our opinion, it has no economic meaning, although in a number of cases it is interpreted as the initial value of Y.

For example, based on data on the cost of equipment X and labor productivity Y, the equation was obtained using the least squares method

Y = -12.14 + 2.08X.

The coefficient a_1 = 2.08 means that an increase in the cost of equipment by 1 million rubles leads, on average, to an increase in labor productivity by 2.08 thousand rubles.

The values of the function Y = a_0 + a_1·X are called calculated values; on the graph they form the theoretical regression line.

The meaning of theoretical regression is that it is an estimate of the average value of the variable Y for a given value of X.
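As a sketch of the whole procedure, the following Python fragment estimates a_0 and a_1 from the normal equations on invented data (np.polyfit(x, y, 1) would return the same coefficients):

```python
import numpy as np

x = np.array([1, 2, 3, 4, 5, 6, 7, 8], dtype=float)
y = np.array([2.1, 2.9, 3.8, 4.2, 5.1, 5.8, 6.9, 7.4], dtype=float)
n = len(x)

# System of normal equations for a0 and a1
A = np.array([[n,       x.sum()],
              [x.sum(), (x ** 2).sum()]])
b = np.array([y.sum(), (x * y).sum()])
a0, a1 = np.linalg.solve(A, b)

y_theor = a0 + a1 * x                   # theoretical (equalized) values
print(f"Y = {a0:.2f} + {a1:.2f} * X")
```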

Paired correlation or paired regression can be considered as a special case of reflecting the relationship between some dependent variable, on the one hand, and one of many independent variables, on the other. When it is necessary to characterize the relationship of the entire specified set of independent variables with the resulting characteristic, we speak of multiple correlation or multiple regression.

8.3. Assessing the significance of relationship parameters

Having obtained correlation and regression estimates, it is necessary to check them for compliance with the true parameters of the relationship.

Existing computer programs usually include several of the most common criteria. To assess the significance of the pairwise correlation coefficient, the standard error of the correlation coefficient is calculated:

σ_r = √( (1 − r_xy²) / (n − 2) ).

As a first approximation, it is necessary that r_xy > σ_r. The significance of r_xy is then checked by comparing the coefficient with its standard error:

t_calc = r_xy / σ_r,

where t_calc is the so-called calculated value of the t-criterion.

If t_calc is greater than the theoretical (tabular) value of Student's criterion (t_tab) for the chosen probability level and (n − 2) degrees of freedom, then r_xy is considered significant.

In the same way, based on the corresponding formulas, the standard errors of the parameters of the regression equation are calculated, and then the t-tests for each parameter. It is important again to check that the condition t calculated > t table is met. Otherwise, there is no reason to trust the obtained parameter estimate.

The conclusion about the correctness of the chosen type of relationship and the significance of the regression equation as a whole is obtained using the F-criterion, calculating its value as

F_calc = ( r² / (1 − r²) ) · ( (n − m) / (m − 1) ),

where n is the number of observations;

m is the number of parameters of the regression equation.

F_calc should be greater than F_theor at v_1 = (m − 1) and v_2 = (n − m) degrees of freedom. Otherwise, the form of the equation, the list of variables, etc. should be reconsidered.
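A compact sketch of both significance checks, using scipy only for the tabular values (the data are invented; m = 2 for the paired equation):

```python
import numpy as np
from scipy import stats

x = np.array([1, 2, 3, 4, 5, 6, 7, 8], dtype=float)
y = np.array([2.1, 2.9, 3.8, 4.2, 5.1, 5.8, 6.9, 7.4], dtype=float)
n, m = len(x), 2                        # m = number of equation parameters

r = np.corrcoef(x, y)[0, 1]
se_r = np.sqrt((1 - r ** 2) / (n - 2))  # standard error of r
t_calc = r / se_r
t_tab = stats.t.ppf(0.975, n - 2)       # two-sided, 5% significance level

F_calc = (r ** 2 / (1 - r ** 2)) * ((n - m) / (m - 1))
F_tab = stats.f.ppf(0.95, m - 1, n - m)

print(f"t: {t_calc:.2f} vs {t_tab:.2f};  F: {F_calc:.2f} vs {F_tab:.2f}")
```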

8.4. Nonparametric methods for estimating relationships

The methods of correlation and variance analysis are not universal: they can be used if all the characteristics being studied are quantitative. When using these methods, it is impossible to do without calculating the main parameters of the distribution (mean values, variances), so they are called parametric methods.

Meanwhile, in statistical practice one has to deal with problems of measuring the relationship between qualitative characteristics, to which parametric methods of analysis in their usual form are not applicable. Statistical science has developed methods that can be used to measure the relationship between phenomena without using quantitative values ​​of the attribute, and therefore distribution parameters. Such methods are called nonparametric.

If the relationship between two qualitative characteristics is studied, then the combinational distribution of population units is used in the form of the so-called tables of mutual contingency.

Let us consider the methodology for analyzing tables of mutual contingency using a specific example of social mobility as a process of overcoming the isolation of individual social and professional groups of the population. Below are data on the distribution of secondary school graduates by area of ​​employment, highlighting similar social groups of their parents.

The distribution of frequencies across the rows and columns of the cross-contingency table makes it possible to identify the main patterns of social mobility: 42.9% of the children of parents in group 1 ("Industry and construction") are employed in the field of intellectual labor (39 out of 91); 38.9% of children whose parents work in agriculture work in industry (34 out of 88), etc.

One can also notice an obvious continuity in the transmission of professions. Thus, of those who came to agriculture, 29 people, or 64.4%, are children of agricultural workers; more than 50% of those in the field of intellectual labor have parents belonging to the same social group, etc.

However, it is important to obtain a generalized indicator that characterizes the closeness of the connection between the characteristics and allows comparison of the connection in different populations. For this purpose, the mutual contingency coefficients of Pearson (C) and Chuprov (K), for example, are calculated:

C = √( φ² / (1 + φ²) ),    K = √( φ² / √((K_1 − 1)·(K_2 − 1)) ),

where φ² is the mean-square contingency indicator, determined by subtracting one from the sum of the ratios of the squared frequencies of each cell of the table to the product of the marginal frequencies of the corresponding column and row:

φ² = Σ ( f_ij² / (f_i.·f_.j) ) − 1;

K_1 and K_2 are the number of groups for each of the characteristics. The value of a mutual contingency coefficient, reflecting the closeness of the connection between qualitative characteristics, varies within the range usual for such indicators, from 0 to 1.
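A minimal numpy sketch of both coefficients for a 3×3 table (the cell frequencies are invented, not those of the graduates example):

```python
import numpy as np

# Invented cross-contingency table: rows - parents' social group,
# columns - graduates' field of employment
f = np.array([[40, 30, 21],
              [34, 29, 25],
              [20, 25, 45]], dtype=float)

row = f.sum(axis=1, keepdims=True)       # f_i. : row totals
col = f.sum(axis=0, keepdims=True)       # f_.j : column totals

phi2 = (f ** 2 / (row * col)).sum() - 1  # mean-square contingency indicator
K1, K2 = f.shape                         # number of groups per characteristic
C = np.sqrt(phi2 / (1 + phi2))           # Pearson's coefficient
K = np.sqrt(phi2 / np.sqrt((K1 - 1) * (K2 - 1)))  # Chuprov's coefficient
print(f"phi^2 = {phi2:.3f}, C = {C:.3f}, K = {K:.3f}")
```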

In socio-economic research, situations are often encountered when a characteristic is not expressed quantitatively, but the units of the population can be ordered. This ordering of population units by attribute value is called ranking. Examples could be ranking students (pupils) by ability, any group of people by level of education, profession, ability to be creative, etc.

When ranking, each unit of the population is assigned a rank, i.e. a serial number. If different units have the same value of the characteristic, they are assigned an averaged ordinal number. For example, if the 5th and 6th units have the same values, both receive a rank equal to (5 + 6) / 2 = 5.5.

The relationship between ranked characteristics is measured using the rank correlation coefficients of Spearman (ρ) and Kendall (τ). These methods are applicable not only to qualitative but also to quantitative indicators, especially for small populations, since nonparametric rank correlation methods are not bound by any restrictions on the nature of the distribution of the characteristic.
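Both coefficients are available in scipy; a sketch on two invented rankings of the same eight units:

```python
from scipy import stats

# Invented ranks of the same 8 population units by two characteristics
a = [1, 2, 3, 4, 5, 6, 7, 8]
b = [2, 1, 4, 3, 6, 5, 8, 7]

rho, p_rho = stats.spearmanr(a, b)   # Spearman's rho (ties handled internally)
tau, p_tau = stats.kendalltau(a, b)  # Kendall's tau
print(f"rho = {rho:.3f} (p = {p_rho:.3f}), tau = {tau:.3f} (p = {p_tau:.3f})")
```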


The study of objectively existing connections between socio-economic phenomena and processes is the most important task of the theory of statistics. In the process of statistical research of dependencies, cause-and-effect relationships between phenomena are revealed, which makes it possible to identify the factors (characteristics) that have a major influence on the variation of the phenomena and processes being studied. Cause-and-effect relationships are such connections between phenomena and processes when a change in one of them - the cause - leads to a change in the other - the effect.

Financial and economic processes are the result of the simultaneous influence of a large number of causes. Consequently, when studying these processes, it is necessary to identify the main, main causes, abstracting from the secondary ones.

The first stage of a statistical study of connection is qualitative analysis, associated with analyzing the nature of the social or economic phenomenon by the methods of economic theory, sociology, and concrete economics. The second stage is the construction of a connection model, based on statistical methods: groupings, average values, and so on. The third and final stage, interpretation of the results, is again associated with the qualitative features of the phenomenon being studied. Statistics has developed many methods for studying relationships; the choice of method depends on the cognitive purpose and the objectives of the study.

Signs, according to their essence and significance for studying the relationship, are divided into two classes. Signs that cause changes in other associated features are called factorial, or simply factors. Characteristics that change under the influence of factor characteristics are called effective.

In statistics, a distinction is made between functional and stochastic dependencies. Functional is a relationship in which a certain value of a factor characteristic corresponds to one and only one value of the resultant characteristic.

If a causal dependence does not appear in each individual case, but in general, on average, over a large number of observations, such a dependence is called stochastic. A special case of a stochastic dependence is correlation, a relationship in which a change in the average value of the resultant characteristic is caused by a change in the factor characteristics.

Connections between phenomena and their characteristics are classified according to the degree of closeness, direction and analytical expression.

According to the degree of closeness, connections range from weak to strong (see the Chaddock scale). According to direction, direct and inverse connections are distinguished. A direct connection is one in which, with an increase or decrease in the values of the factor characteristic, an increase or decrease in the values of the resultant characteristic occurs; thus, an increase in production volumes contributes to an increase in the profit of the enterprise. With an inverse connection, the values of the resultant characteristic change in the opposite direction to the change in the factor characteristic: with an increase or decrease in the values of one characteristic, the values of the other decrease or increase; thus, a reduction in the cost per unit of production entails an increase in profitability.

According to the analytical expression, a distinction is made between straight-line (or simply linear) and nonlinear connections. If a statistical relationship between phenomena can be approximately expressed by the equation of a straight line, it is called a linear connection.


Lecture

Topic: Statistical study of the relationship between indicators

1. Methods of correlation and regression analysis of the relationship between commercial activity indicators

The study of relationships in the market for goods and services is a most important function of economic workers. Studying the relationship between indicators of commercial activity is necessary not only to establish the existence of a connection: in order to scientifically substantiate forecasting and rational management of the mechanism of market relations, it is important to give the identified connections mathematical definiteness. Without a quantitative assessment of the pattern of a connection, it is impossible to bring the results of economic developments to a level at which they can be used for practical purposes.

Statistical indicators of commercial activity, reflecting the objective interdependence of its individual aspects, may stand in the following main types of connection:

The balance sheet relationship between commercial activity indicators characterizes the relationship between the sources of funds and their use. It is expressed, for example, in the commodity balance formula:

On + P = V + Ok,

where On is the opening stock of goods, P is receipts, V is disposal (sales), and Ok is the closing stock.

The left side of the formula characterizes the supply of goods, the right side their use. The important practical significance of the commodity balance formula is that, in the absence of direct quantitative accounting of sales, the volume of retail sales of individual goods is determined on its basis.
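A minimal sketch of this use of the balance; the figures are invented:

```python
# Commodity balance: On + P = V + Ok.
# Given any three elements, the fourth follows from the identity.
On = 120.0   # opening stock, thousand UAH (invented)
P = 860.0    # receipts (invented)
Ok = 95.0    # closing stock (invented)

V = On + P - Ok              # retail sales recovered from the balance
assert On + P == V + Ok      # the identity holds by construction
print(f"Sales V = {V}")
```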

Component relationships of commercial activity indicators are characterized by the fact that a change in a statistical indicator is determined by a change in the components included in this indicator as multipliers:

a = b x c

In business statistics, component relationships are used in the index method of identifying the role of individual factors in the overall measurement of a complex indicator.

I_pq = I_p × I_q

The practical significance of the component relationship is that it allows one to determine the value of an unknown component from the known ones.
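A sketch of the component relationship of indices with invented index values:

```python
# Component relationship of indices: I_pq = I_p * I_q.
# Invented example: prices rose by 10%, quantities fell by 5%.
I_p = 1.10                   # price index
I_q = 0.95                   # physical volume (quantity) index
I_pq = I_p * I_q             # turnover index: 1.045, i.e. +4.5%

# Knowing the turnover and price indices, the quantity index is recovered:
I_q_recovered = I_pq / I_p
print(f"I_pq = {I_pq:.3f}, recovered I_q = {I_q_recovered:.3f}")
```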

Factor relationships are characterized by the fact that they manifest themselves in a consistent variation of the indicators being studied. Some indicators act as factor indicators, others as result indicators. Factor connections, in turn, can be functional or correlational. With a functional connection, the change in the resultant characteristic (y) is entirely determined by the action of the factor characteristic (x):

y = f(x).

In a correlation connection, the change in the resultant characteristic (y) is due to the influence of the factor characteristic (x) not entirely, but only partially, since the influence of other factors (e) is also possible:

y = f(x) + e.

By their nature, correlation connections are relative. Here, for the same recorded value of the factor characteristic, different values of the resultant characteristic are possible, owing to the presence of other factors that may differ in composition, direction and strength of action on individual units of the statistical population. Therefore, for the studied statistical population as a whole, a relationship is established in which a certain change in the factor characteristic corresponds to an average change in the resultant characteristic. A characteristic feature of correlations, consequently, is that they appear not in isolated cases but in the mass. In the statistical study of correlations, the influence of the recorded factor characteristics is determined while abstracting from other arguments. When studying correlations, the following tasks are set:

checking the provisions of economic theory on the possibility of a connection between the studied indicators and giving the identified connection an analytical form of dependence;

establishing quantitative estimates of the closeness of the connection, characterizing the strength of the influence of factor characteristics on the results.

If the relationship between two characteristics is studied, this is a pairwise correlation. If the relationship between many characteristics is studied, the correlation is multiple.

2. Construction of the equations of the modeled functions

The most developed methodology in the theory of statistics is the so-called pairwise correlation methodology. When studying the relationship between indicators, various types of linear and curvilinear relationship equations are used:

linear: y = a + bx

parabolic: y = a + bx + cx²

hyperbolic: y = a + b/x

Determining the parameters of the regression equation begins with establishing the fact of a connection between the indicators under consideration. To do this, the pair correlation coefficient is calculated:

r_xy = Σ(x − x̄)(y − ȳ) / √( Σ(x − x̄)² · Σ(y − ȳ)² ).

To draw conclusions about the practical significance of the resulting correlation coefficient, a qualitative assessment is given based on the Chaddock scale: |r| = 0.1-0.3 - weak; 0.3-0.5 - moderate; 0.5-0.7 - noticeable; 0.7-0.9 - high; 0.9-0.99 - very high.

With closeness-of-connection indices exceeding 0.7, the dependence of the resultant characteristic on the factor one is considered high, since the coefficient of determination is then about 50% or more.

The coefficient of determination characterizes what proportion of the variation of the effective indicator is explained by the influence of the factor being studied:

D = r² · 100%.

Therefore, if the correlation coefficient exceeds 0.7 between the effective indicator and the factor under study, there is a relationship that explains the change in the effective indicator from the factor under consideration by more than 50%.

Example: let us analyze data on the average monthly wage in the Donetsk region for a number of years (the source table of years and average salary, UAH, is not reproduced here).

Thus, there is a high dependence of the average monthly wage on time: 92% of the variation in wages is explained by the change of year.

3. Estimation of equation parameters


The parameters of the functions selected for modeling can be found in different ways. The most accurate is the least squares method. On its basis, a special system of equations is formed for each of the functions:

linear:
Σy = na + bΣx
Σxy = aΣx + bΣx²

parabolic:
Σy = na + bΣx + cΣx²
Σxy = aΣx + bΣx² + cΣx³
Σx²y = aΣx² + bΣx³ + cΣx⁴

hyperbolic:
Σy = na + bΣ(1/x)
Σ(y/x) = aΣ(1/x) + bΣ(1/x²)

In each of the systems:

y is the effective indicator;

x is the time indicator;

n is the number of observations;

a, b, c are the model parameters.

The time indicator is counted starting from 1. Based on the known values of x and y, all the sums are determined and substituted into the system. Solving the system yields specific numerical values of the parameters, which are substituted into the modeling functions; these must then be evaluated and, if suitable, used in practice.

Example: we first calculate the auxiliary table of sums (not reproduced here).

Let us set up the systems of equations for the three functions and find the values of the parameters:

linear model: 1525 = 7a + 28b

7266 = 28a + 140b

a = 51.29, b = 41.64, y = 51.29 + 41.64x

parabolic model: 1525 = 7a + 28b + 140c

7266 = 28a + 140b + 784c

40248 = 140a + 784b + 4676c

a = 111.29, b = 1.64, c = 5.00, y = 111.29 + 1.64x + 5.00x²

hyperbolic model: 1525 = 7a + 2.59b

432.13 = 2.59a + 1.51b

a = 306.46, b = −239.48, y = 306.46 − 239.48/x
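The parameter values above can be verified by solving the three systems directly; a short numpy sketch using only the sums printed in the systems:

```python
import numpy as np

# Linear model
A = np.array([[7., 28.], [28., 140.]])
b = np.array([1525., 7266.])
print("linear:    ", np.linalg.solve(A, b))        # [51.29, 41.64]

# Parabolic model
A = np.array([[7., 28., 140.], [28., 140., 784.], [140., 784., 4676.]])
b = np.array([1525., 7266., 40248.])
print("parabolic: ", np.linalg.solve(A, b))        # [111.29, 1.64, 5.00]

# Hyperbolic model
A = np.array([[7., 2.59], [2.59, 1.51]])
b = np.array([1525., 432.13])
print("hyperbolic:", np.linalg.solve(A, b))        # [306.46, -239.48]
```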

4. Assessing the adequacy and reliability of the equation

The adequacy of an economic-mathematical model can be established using the average approximation error (the average percentage discrepancy between theoretical and actual values):

ε̄ = (1/n) · Σ ( |y_1 − y_0| / y_1 ) · 100%,

where y_1 are the actual values of the performance indicator;

y_0 are the theoretical values found from the equation.

When modeling economic indicators, an error of 5% is most often allowed. The model is considered adequate, and therefore significant, if ε̄ ≤ 5%.

The selection of the most suitable model can be carried out on the basis of the residual standard deviation (residual variance):

σ_resid = √( Σ(y_1 − y_0)² / (n − l) ),

where l is the number of parameters of the equation.

The best function will be the one with the smallest residual variance.

The reliability of the equation is assessed using the Fisher criterion, calculating the F-statistic:

F = [ Σ(y_0 − ȳ)² / (l − 1) ] / [ Σ(y_1 − y_0)² / (n − l) ],

where ȳ is the average value of the effective indicator.

The larger the calculated F value, the more significant the model. The calculated value is compared with the critical value, found in the Fisher distribution tables for (l − 1) and (n − l) degrees of freedom at a significance level of 0.05 (5% error). If F > F_tab, the equation is considered reliable with a probability of 0.95; otherwise, the equation is not considered reliable.
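All three adequacy measures can be wrapped in one helper; a sketch on an invented seven-point series fitted with the linear model from above:

```python
import numpy as np
from scipy import stats

def adequacy(y_actual, y_theor, n_params, alpha=0.05):
    """Average approximation error (%), residual SD, F and its critical value."""
    y1 = np.asarray(y_actual, dtype=float)
    y0 = np.asarray(y_theor, dtype=float)
    n, l = len(y1), n_params
    eps = np.mean(np.abs(y1 - y0) / y1) * 100            # approximation error
    s_resid = np.sqrt(np.sum((y1 - y0) ** 2) / (n - l))  # residual SD
    F = (np.sum((y0 - y1.mean()) ** 2) / (l - 1)) / \
        (np.sum((y1 - y0) ** 2) / (n - l))
    F_tab = stats.f.ppf(1 - alpha, l - 1, n - l)
    return eps, s_resid, F, F_tab

# Invented actual series and the fitted linear model y = 51.29 + 41.64x
x = np.arange(1, 8)
y_fact = np.array([95., 148., 210., 255., 297., 361., 409.])
y_fit = 51.29 + 41.64 * x
print(adequacy(y_fact, y_fit, n_params=2))
```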

Calculations for the linear, parabolic and hyperbolic functions (the auxiliary tables with the approximation errors and the quantities (y_0 − ȳ)² are not reproduced here) give tabulated critical values F_tab of 230.2, 19.25 and 230.2, respectively.

Thus, none of the presented functions is sufficiently reliable or of practical significance, owing to the large discrepancies between the theoretical and actual values of the effective indicator.

To characterize the economic content of the parameters of the equations, it is most appropriate to use elasticity coefficients, which show by what percentage, on average, the function changes when the argument changes by 1%, with the remaining factors fixed at any level:

E_i = a_i · (x̄_i / ȳ),

where E_i is the elasticity coefficient of the i-th factor;

a_i is the regression parameter of the i-th factor;

x̄_i is the average value of the i-th factor;

ȳ is the average value of the effective indicator.
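A one-line computation suffices; in this sketch a_1 is taken from the linear model above, and the means x̄ = 28/7 = 4 and ȳ = 1525/7 ≈ 217.86 follow from the sums used in the systems:

```python
# Elasticity coefficient: E_i = a_i * (mean(x_i) / mean(y))
a1 = 41.64         # regression parameter of the factor (from the linear model)
x_mean = 28 / 7    # average value of the factor (time indicator)
y_mean = 1525 / 7  # average value of the effective indicator

E = a1 * x_mean / y_mean
print(f"E = {E:.2f}: a 1% change in the factor changes the result "
      f"by about {E:.2f}% on average")
```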


Social phenomena, including legally significant ones, are interconnected, depend on each other and condition each other. The existing relationships are realized in the form of causality, functional connection, connection of states, etc. A special role in the relationships of social phenomena belongs to causality, which is a part of the universal connection - not subjective, but objectively real. This objectively necessary connection, in which one or several interrelated phenomena, called the cause (factor), give rise to another phenomenon, called the consequence (result), can be called causality.

Legal sciences specify this concept in relation to phenomena and processes of a legally significant nature. Among the legal disciplines, those furthest advanced in the study of causation are criminology - the science of crime, its causes and prevention - and criminal law, where the establishment of a causal relationship between an act and its consequence is a necessary condition for criminal liability. But questions of causation are also important in administrative, civil, and other branches of law.

There is not only commonality between causality in criminology and law, but also significant differences. The causal relationship between criminogenic factors and the commission of a crime (causes and crime) precedes in time the causal relationship between a socially dangerous action (inaction) and criminal consequences. The latter is characterized mainly by dynamic patterns and functional connections, and between criminogenic factors and criminal behavior there are mainly statistical patterns and correlations.

Any natural connection presupposes repeatability, consistency and order in phenomena, but the connections under consideration manifest themselves in different ways: functional ones - in each individual case, and correlational ones - in a large mass of phenomena. For example, there is a direct causal functional connection between a knife blow and bodily injury (unless, of course, the damage is complicated by infection of the wound, unqualified medical care, etc.). Functional dependence is characterized by the fact that a change in any one feature, which is a function, is associated with a change in another feature. This relationship is equally manifested in all units of any population.

If a blow with a knife causes a wound to the body (we abstract from the type of knife, the force of the blow, its location, the nature of the wound and other specific circumstances), then no matter to whom this blow is inflicted, the relationship between the blow and the wound will manifest itself everywhere. Having established it once, we use this dependence in all similar cases; medical and forensic examinations are based on knowledge of it. Still, attributing the relationship between a knife blow and an injury to a functional connection is rather arbitrary: this form of dependence is not identical to the functional connection in physics or mathematics.

In the exact sciences, functional relationships are usually expressed by formulas. For example, in the formula S = πR², the area of a circle S (the resultant sign) is directly proportional to the square of its radius R (the factorial sign). The formula I = U/R is somewhat harder to decipher: the strength of the electric current I is directly proportional to the voltage U and inversely proportional to the resistance R. In this case, the resulting characteristic is determined by two factor characteristics with opposite effects: the higher the voltage or the lower the resistance, the greater the current strength. The functional dynamic connection is precisely calculated; therefore it is both complete and accurate. It operates in all autonomous systems with a relatively small number of elements that depend little on external influences.

Legal sciences deal mainly with socio-legal phenomena and processes, where there are no such rigid, unambiguously complete and accurate connections. The causality of crime, and especially crime, as a mass social phenomenon, is associated with a huge set of interdependent circumstances, which, with a change in the action of at least one of them, can change the nature of the entire interaction as a whole. The number of circumstances that influence the commission of crimes reaches 450 or more.

The causal relationship between each sign-factor and sign-consequence is characterized by ambiguity: one or another sign-consequence changes under the influence of a complex of sign-factors, and each value of the sign-factor corresponds (under the influence of other sign-factors) to several values ​​of the sign-effect. Therefore, the connection between the cause (set of causes) and the effect (crime or delinquency) is multi-valued and has a probabilistic nature.

The ambiguity lies not only in the fact that each offense (and delinquency in general) is the result of many causes, but also in the fact that each cause, interacting with one or another set of other causes, can give rise to not one but several consequences, including various types of illegal and lawful behavior.

The probabilistic side of the ambiguity of causation in criminology and sociology of law “consists in the fact that when replacing any condition, even for the same reason, a different result is obtained.” This form of causal relationship, in which the cause does not determine the effect unambiguously, but only with a certain degree of probability, is incomplete and is called a correlation relationship. It reflects a statistical pattern and operates in all non-autonomous systems that depend on constantly changing external conditions and have a very large number of elements (factors).

The causes of crime, for example, are “dissolved” in the total mass of positive influences, “distributed” in the structure of a person’s activity and “stretched” throughout his entire life. Therefore, the effect of one reason or another can be detected only in a very large mass of cases. But even at the mass statistical level, where the influence of random factors is somehow neutralized through mutual destruction, the discovered dependencies cannot be complete and accurate, i.e. functional. The effect of unaccounted for, unknown, and often known, but difficult to detect factors is manifested in the fact that the studied connections turn out to be not only incomplete, but also approximate.

It is reasonably believed that raising a child without one or both parents is a criminogenic factor. Does this mean that every person brought up in such conditions will commit a crime in the future? No way. Behind the generalized factor - upbringing without parents - there can be a huge number of other factors, criminogenic and anti-criminogenic, which are different for each child. But when studying a large mass of people raised by parents and without parents, in all countries of the world, a statistical deviation is established with regularity: people raised without one or both parents commit crimes much more often than those raised in a two-parent family.

Between criminogenic factors and crime there is a direct correlation (with a "+" sign). For example, the higher the level of alcoholism in a society, the higher the crime rate, including specific ("drunken") crime. Between anti-criminogenic factors and crime there is an inverse correlation (with a "−" sign). For example, the higher the social control in a society, the lower the crime rate. Both direct and inverse connections can be linear or curvilinear.

Straight-line (linear) relationships appear when, with an increase in the values of the factor trait, there is an increase (direct) or decrease (inverse) in the value of the consequence trait. Mathematically, such a relationship is expressed by the equation of a straight line (the regression equation):

y = a + bx,

where y is the sign-consequence; a and b are the corresponding connection coefficients; x is the sign-factor.

We have already encountered this formula when aligning a time series along a straight line.

Curvilinear connections are of a different nature. An increase in the value of a factor characteristic has an uneven impact on the value of the resulting characteristic. At first this connection can be direct, and then reverse. In legal science, such connections have hardly been studied, but they exist. A well-known example is the connection between crimes and the age of the offenders. Initially, the criminal activity of individuals increases in direct proportion to the increase in the age of the offenders (up to approximately 30 years), and then, with increasing age, criminal activity decreases. Moreover, the top of the distribution curve of offenders by age is shifted from the average to the left (towards a younger age) and is asymmetrical.

A more complex example: with the expansion of social control, the level of illegal behavior decreases, but further totalization of control turns it from an anti-criminogenic factor into a criminogenic one. Therefore, “tightening the screws” in society is socially useful only to a certain extent. Such connections are statistically described by equations of curved lines (hyperbolas, parabolas, etc.).

Correlation linear connections can be one-factor, when the connection between one sign-factor and one sign-consequence is studied (pair correlation), or multifactor, when the influence of many interacting sign-factors on a sign-consequence is studied (multiple correlation).

Pairwise correlation has long been used in legal statistics, while multiple correlation is practically not used, although multifactor connections can be said to dominate in criminology, tortology and the sociology of law. This is due to a number of difficulties: the unorganized consideration of factors, the insufficient mathematical, statistical and sociological training of lawyers, and other objective circumstances.

Correlations between some phenomena and others are visible already at the first stages of statistical data processing. The summary and grouping of statistical indicators, the calculation of relative and average values, and the construction of variational, dynamic, and parallel series make it possible to establish the existence of a relationship between the phenomena being studied and even its nature (direct or inverse). If, having constructed a variation series of criminals by age, we find that the main frequencies are grouped in the interval of youth, we have sufficient grounds to believe that youth is the most criminogenic age - although age (as established in previous chapters) does not act in its own right, but only as an integrated exponent of criminogenic conditions that interact with the corresponding age-related changes in a person.

Let us turn to the state of intoxication, which is considered a criminogenic factor in all countries of the world and is therefore statistically monitored. In Russia in 1996, 39% of all recorded crimes were committed by offenders in a state of intoxication, including 77.6% of rapes, 73.5% of intentional murders, 69.8% of cases of hooliganism, 59.7% of armed robberies, 57.0% of robberies, 37.7% of thefts and 0% of bribery cases. These percentages indicate a direct correlation between crime and drunkenness (except for bribery). Since these figures repeat almost year after year, they indicate not only the existence of this connection but, to a certain extent, also the degree of influence of drunkenness on various types of acts. For more accurate measurement of relationships, statistics has a large arsenal of methods.

  • See: Kudryavtsev V.N. Causality in Criminology. Moscow, 1968; Tsereteli T.V. Causality in Criminal Law. Moscow, 1963.
  • See: Model of Regional Criminological and Criminal Legal Forecast. Moscow, 1994.
  • Kudryavtsev V.N. Causality in Criminology. P. 9.
  • See: Luneev V.V. Crime of the 20th Century: World, Regional and Russian Trends. Pp. 775-840.