算法题:两个大小不同的已排序数组的中位数

2021年4月13日10:14:45 发表评论 1,609 次浏览

本文概述

给定两个排序的数组a []和b [], 任务是在O(log n + log m)时间复杂度下(当n是第一个数组中的元素数时)找到这些排序的数组的中位数。 m是第二个数组中的元素数。

这是两个大小相等的排序数组的中值问题的扩展。这里我们也处理大小不等的数组。

例子:

Input: ar1[] = {-5, 3, 6, 12, 15}
        ar2[] = {-12, -10, -6, -3, 4, 10}
Output : The median is 3.
Explanation : The merged array is :
        ar3[] = {-12, -10, -6, -5 , -3, 3, 4, 6, 10, 12, 15}, So the median of the merged array is 3

Input: ar1[] = {2, 3, 5, 8}
        ar2[] = {10, 12, 14, 16, 18, 20}
Output : The median is 11.
Explanation : The merged array is :
        ar3[] = {2, 3, 5, 8, 10, 12, 14, 16, 18, 20}
        if the number of the elements are even, so there are two middle elements, take the average between the two :
        (10 + 12) /2 = 11.

方法1:此方法使用线性且更简单的方法。

方法:给定的数组是有序的,因此以一种有效的方式合并已排序的数组,并保留在输出数组或打印形式中插入的元素的计数。因此,当输出数组中的元素是给定数组原始大小的一半时,将该元素打印为中值元素。有两种情况::

  1. 情况1:m + n为奇数, 中位数为合并两个数组后获得的数组中第(m + n)/ 2个索引。
  2. 情况2:m + n是偶数, 中位数是合并两个数组后获得的数组中索引((m + n)/ 2 – 1)和(m + n)/ 2元素的平均值

算法

  1. 给定两个数组进行排序。因此它们可以在O(m + n)时间内合并。创建一个变量计数, 使输出数组中的元素计数。
  2. 如果(m + n)的值是奇数, 则只有一个中位数, 否则中位数是索引为(m + n)/ 2和((m + n)/ 2-1)的元素的平均值。
  3. 要合并两个数组, 请将两个索引i和j初始分配为0。比较第一个数组的ith索引和第二个数组的jth索引, 增加最小元素的索引并增加计数。
  4. 检查计数是否达到(m + n)/ 2(m + n)是否为奇数并存储元素, 是否甚至存储(m + n)/ 2 th和(m + n)/ 2 -1 th的平均值元素并打印。

实现

C ++

//A Simple Merge based O(n) solution to find 
//median of two sorted arrays 
#include <bits/stdc++.h>
using namespace std;
  
/* This function returns median of ar1[] and ar2[]. 
Assumption in this function: 
Both ar1[] and ar2[] are sorted arrays */
int getMedian( int ar1[], int ar2[], int n, int m) 
{ 
     int i = 0; /* Current index of input array ar1[] */
     int j = 0; /* Current index of input array ar2[] */
     int count; 
     int m1 = -1, m2 = -1; 
  
     //Since there are (n+m) elements, //There are following two cases 
     //if n+m is odd then the middle 
     //index is median i.e. (m+n)/2 
     if ((m + n) % 2 == 1) 
     { 
         for (count = 0; count <= (n + m)/2; count++)
         { 
             if (i != n && j != m)
             { 
                 m1 = (ar1[i]> ar2[j]) ? ar2[j++] : ar1[i++]; 
             } 
             else if (i <n)
             { 
                 m1 = ar1[i++]; 
             } 
             //for case when j<m, else
             { 
                 m1 = ar2[j++]; 
             } 
         } 
         return m1; 
     } 
      
     //median will be average of elements 
     //at index ((m+n)/2 - 1) and (m+n)/2 
     //in the array obtained after merging ar1 and ar2 
     else 
     { 
         for (count = 0; count <= (n + m)/2; count++) 
         { 
             m2 = m1; 
             if (i != n && j != m)
             { 
                 m1 = (ar1[i]> ar2[j]) ? ar2[j++] : ar1[i++]; 
             } 
             else if (i <n)
             { 
                 m1 = ar1[i++]; 
             } 
             //for case when j<m, else
             { 
                 m1 = ar2[j++]; 
             } 
         } 
         return (m1 + m2)/2; 
     } 
} 
  
/* Driver code */
int main() 
{ 
     int ar1[] = {900}; 
     int ar2[] = {5, 8, 10, 20}; 
  
     int n1 = sizeof (ar1)/sizeof (ar1[0]); 
     int n2 = sizeof (ar2)/sizeof (ar2[0]); 
     cout <<getMedian(ar1, ar2, n1, n2); 
     return 0; 
} 
  
//This is code is contributed by rathbhupendra

C

//A Simple Merge based O(n) solution to find 
//median of two sorted arrays 
#include <stdio.h> 
  
/* This function returns median of ar1[] and ar2[]. 
Assumption in this function: 
Both ar1[] and ar2[] are sorted arrays */
int getMedian( int ar1[], int ar2[], int n, int m) 
{ 
     int i = 0; /* Current index of input array ar1[] */
     int j = 0; /* Current index of input array ar2[] */
     int count; 
     int m1 = -1, m2 = -1; 
  
     //Since there are (n+m) elements, //There are following two cases
     //if n+m is odd then the middle 
     //index is median i.e. (m+n)/2
     if ((m + n) % 2 == 1) {
         for (count = 0; count <= (n + m)/2; count++) {
             if (i != n && j != m){
             m1 = (ar1[i]> ar2[j]) ? ar2[j++] : ar1[i++];
             }
             else if (i <n){
             m1 = ar1[i++];
             }
             //for case when j<m, else {
             m1 = ar2[j++];
             }
         }
         return m1;
     }
      
     //median will be average of elements 
     //at index ((m+n)/2 - 1) and (m+n)/2
     //in the array obtained after merging ar1 and ar2
     else {
         for (count = 0; count <= (n + m)/2; count++) {
             m2 = m1;
             if (i != n && j != m){
             m1 = (ar1[i]> ar2[j]) ? ar2[j++] : ar1[i++];
             }
             else if (i <n){
             m1 = ar1[i++];
             }
             //for case when j<m, else {
             m1 = ar1[j++];
             }
         }
         return (m1 + m2)/2;
     }
}
  
/* Driver program to test above function */
int main() 
{ 
     int ar1[] = {900}; 
     int ar2[] = {5, 8, 10, 20}; 
  
     int n1 = sizeof (ar1)/sizeof (ar1[0]); 
     int n2 = sizeof (ar2)/sizeof (ar2[0]); 
     printf ( "%d" , getMedian(ar1, ar2, n1, n2)); 
     getchar (); 
     return 0; 
} 
//This code is uploaded by Pratil

输出如下:

10

复杂度分析:

  • 时间复杂度:O(m + n)。
    要合并两个数组, 需要O(m + n)时间。
  • 空间复杂度:O(1)。
    不需要额外的空间。

高效的解决方案:

方法:

这个想法很简单, 计算两个数组的中位数, 并丢弃每个数组的一半。

现在, 有一些基本的极端情况。对于数组大小小于或等于2

假设有两个数组, 并且两个数组的大小都大于2。如果较小数组的中间元素, 则找到第一个数组的中间元素和第二个数组的中间元素(第一个数组小于第二个数组)小于第二个数组, 那么可以说较小数组的前半部分的所有元素都将位于输出(合并数组)的前半部分。因此, 通过忽略较小数组的前半部分和较大数组的后半部分来减少搜索空间。否则忽略较小数组的后半部分和较大数组的前半部分。

除此之外, 还有其他一些基本情况:

  1. 如果较小数组的大小为0。则返回较大数组的中位数。
  2. 如果较小数组的大小为1。
    1. 较大数组的大小也为1。返回两个元素的中值。
    2. 如果较大数组的大小为奇数。然后, 将第二个数组中的元素相加后, 它将是偶数, 因此中位数将是两个中间元素的平均值。因此, 较小数组中的元素只有在较大数组中的第(m / 2 – 1)个元素与第(m / 2 + 1)个元素之间时, 才会影响中位数。因此, 找到较小数组的元素与较大数组的第(m / 2), 第(m / 2 – 1)和(m / 2 + 1)个元素的四个元素之间的中位数
    3. 同样, 如果大小为偶数, 则检查三个元素的中位数:较小数组的元素和较大数组的第(m / 2), 第(m / 2 – 1)个元素
  3. 如果较小数组的大小为2
    1. 如果较大的数组也有两个元素, 则求四个元素的中位数。
    2. 如果较大的数组具有奇数个元素, 则中位数将是以下3个元素之一
      1. 大数组的中间元素
      2. 较小数组的第一个元素的最大值和正好在中间的元素的最大值, 即较大数组中的M / 2-1个元素
      3. 较小数组和元素的第二个元素的最小值
        在较大数组的中间之后, 即较大数组中的M / 2 + 1个元素
    3. 如果较大的数组具有偶数个元素, 则中位数将是以下4个元素之一
      1. 较大数组的中间两个元素
      2. 较小数组的第一个元素的最大值以及紧靠较大数组的第一个中间元素之前的元素的最大值, 即M / 2 – 2nd元素
      3. 较小数组的第二个元素的最小值和较大数组中的第二个中间元素之后的元素的最小值, M / 2 + 1个元素

每个数组的一半如何丢弃?

让我们以一个例子来理解此输入:arr [] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}, brr [] = {11, 12, 13, 14, 15, 16, 17, 18, 19}代码的空运行:递归调用1:较小的array [] = 1 2 3 4 5 6 7 8 9 10, 中= 5个较大的array [] = 11 12 13 14 15 16 17 18 19, mid = 15 5 <15丢弃第一个数组的前半部分和第二个数组的后半部分递归调用2:较小的array [] = 11 12 13 14 15, 中间= 13较大的array [] = 5 6 7 8 9 10, mid = 7 7 <13丢弃第二个数组的前半部分和第一个数组的后半部分递归调用3:较小的array [] = 11 12 13, mid = 12较大的array [] = 7 8 9 10, mid = 8 8 <12丢弃第二个数组的前半部分和第一个数组的后半部分递归调用4:较小的array [] = 11 12较大的array [] = 8 9 10较小的数组的大小为2, 较大的数组的大小为array是奇数, 因此, 中位数将是max(11, 8), 9, min(10, 12)的中位数, 即9、10、11, 因此中位数是10。输出:10.000000

算法:

  1. 创建一个递归函数, 该函数需要两个数组以及两个数组的大小。
  2. 请注意小于2的数组大小的基本情况。(先前在"方法"中已讨论过)。注意:第一个数组始终是较小的数组。
  3. 找到两个数组的中间元素。即分别位于第一和第二数组的(n – 1)/ 2和(m – 1)/ 2处的元素。比较两个元素。
  4. 如果较小数组的中间元素小于较大数组的中间元素, 则较小数组的前半部分必须严格位于合并数组的前半部分。还可以说, 较大数组的前半部分和较小数组的后半部分中有一个元素是中位数。因此, 将搜索空间减小到较大数组的前半部分和较小数组的后半部分。
  5. 同样, 如果较小数组的中间元素大于较大数组的中间元素, 则将搜索空间减小到较小数组的前一半和较大数组的后一半。

实现

C ++

//A C++ program to find median of two sorted arrays of
//unequal sizes
#include <bits/stdc++.h>
using namespace std;
  
//A utility function to find median of two integers
float MO2( int a, int b)
{ return ( a + b ) /2.0; }
  
//A utility function to find median of three integers
float MO3( int a, int b, int c)
{
     return a + b + c - max(a, max(b, c))
                      - min(a, min(b, c));
}
  
//A utility function to find a median of four integers
float MO4( int a, int b, int c, int d)
{
     int Max = max( a, max( b, max( c, d ) ) );
     int Min = min( a, min( b, min( c, d ) ) );
     return ( a + b + c + d - Max - Min ) /2.0;
}
  
//Utility function to find median of single array
float medianSingle( int arr[], int n)
{
    if (n == 0)
       return -1;
    if (n%2 == 0)
         return ( double )(arr[n/2] + arr[n/2-1])/2;
    return arr[n/2];
}
  
//This function assumes that N is smaller than or equal to M
//This function returns -1 if both arrays are empty
float findMedianUtil( int A[], int N, int B[], int M )
{
     //If smaller array is empty, return median from second array
     if (N == 0)
       return medianSingle(B, M);
  
     //If the smaller array has only one element
     if (N == 1)
     {
         //Case 1: If the larger array also has one element, //simply call MO2()
         if (M == 1)
             return MO2(A[0], B[0]);
  
         //Case 2: If the larger array has odd number of elements, //then consider the middle 3 elements of larger array and
         //the only element of smaller array. Take few examples
         //like following
         //A = {9}, B[] = {5, 8, 10, 20, 30} and
         //A[] = {1}, B[] = {5, 8, 10, 20, 30}
         if (M & 1)
             return MO2( B[M/2], MO3(A[0], B[M/2 - 1], B[M/2 + 1]) );
  
         //Case 3: If the larger array has even number of element, //then median will be one of the following 3 elements
         //... The middle two elements of larger array
         //... The only element of smaller array
         return MO3( B[M/2], B[M/2 - 1], A[0] );
     }
  
     //If the smaller array has two elements
     else if (N == 2)
     {
         //Case 4: If the larger array also has two elements, //simply call MO4()
         if (M == 2)
             return MO4(A[0], A[1], B[0], B[1]);
  
         //Case 5: If the larger array has odd number of elements, //then median will be one of the following 3 elements
         //1. Middle element of larger array
         //2. Max of first element of smaller array and element
         //just before the middle in bigger array
         //3. Min of second element of smaller array and element
         //just after the middle in bigger array
         if (M & 1)
             return MO3 ( B[M/2], max(A[0], B[M/2 - 1]), min(A[1], B[M/2 + 1])
                        );
  
         //Case 6: If the larger array has even number of elements, //then median will be one of the following 4 elements
         //1) & 2) The middle two elements of larger array
         //3) Max of first element of smaller array and element
         //just before the first middle element in bigger array
         //4. Min of second element of smaller array and element
         //just after the second middle in bigger array
         return MO4 ( B[M/2], B[M/2 - 1], max( A[0], B[M/2 - 2] ), min( A[1], B[M/2 + 1] )
                    );
     }
  
     int idxA = ( N - 1 ) /2;
     int idxB = ( M - 1 ) /2;
  
      /* if A[idxA] <= B[idxB], then median must exist in
         A[idxA....] and B[....idxB] */
     if (A[idxA] <= B[idxB] )
       return findMedianUtil(A + idxA, N/2 + 1, B, M - idxA );
  
     /* if A[idxA]> B[idxB], then median must exist in
        A[...idxA] and B[idxB....] */
     return findMedianUtil(A, N/2 + 1, B + idxA, M - idxA );
}
  
//A wrapper function around findMedianUtil(). This function
//makes sure that smaller array is passed as first argument
//to findMedianUtil
float findMedian( int A[], int N, int B[], int M )
{
     if (N> M)
        return findMedianUtil( B, M, A, N );
  
     return findMedianUtil( A, N, B, M );
}
  
//Driver program to test above functions
int main()
{
     int A[] = {900};
     int B[] = {5, 8, 10, 20};
  
     int N = sizeof (A) /sizeof (A[0]);
     int M = sizeof (B) /sizeof (B[0]);
  
     printf ( "%f" , findMedian( A, N, B, M ) );
     return 0;
}

的PHP

<?php
//A PHP program to find median 
//of two sorted arrays of 
//unequal sizes
  
//A utility function to
//find median of two integers
function MO2( $a , $b )
{ 
     return ( $a + $b ) /2.0;
}
  
//A utility function to 
//find median of three integers
function MO3( $a , $b , $c )
{
     return $a + $b + $c - 
        max( $a , max( $b , $c )) -
        min( $a , min( $b , $c ));
}
  
//A utility function to find
//median of four integers
function MO4( $a , $b , $c , $d )
{
     $Max = max( $a , max( $b , max( $c , $d )));
     $Min = min( $a , min( $b , min( $c , $d )));
     return ( $a + $b + $c + $d - $Max - $Min ) /2.0;
}
  
//Utility function to
//find median of single array
function medianSingle( $arr , $n )
{
if ( $n == 0)
     return -1;
if ( $n % 2 == 0)
         return ( $arr [ $n /2] + 
                 $arr [ $n /2 - 1]) /2;
return $arr [ $n /2];
}
  
//This function assumes that N 
//is smaller than or equal to M
//This function returns -1 if
//both arrays are empty
function findMedianUtil(& $A , $N , & $B , $M )
{
     //If smaller array is empty, //return median from second array
     if ( $N == 0)
     return medianSingle( $B , $M );
  
     //If the smaller array 
     //has only one element
     if ( $N == 1)
     {
         //Case 1: If the larger
         //array also has one 
         //element, simply call MO2()
         if ( $M == 1)
             return MO2( $A [0], $B [0]);
  
         //Case 2: If the larger array 
         //has odd number of elements, //then consider the middle 3
         //elements of larger array and
         //the only element of smaller
         //array. Take few examples
         //like following
         //$A = array(9), //$B = array(5, 8, 10, 20, 30) 
         //and $A = array(1), //$B = array(5, 8, 10, 20, 30)
         if ( $M & 1)
             return MO2( $B [ $M /2], $MO3 ( $A [0], $B [ $M /2 - 1], $B [ $M /2 + 1]));
  
         //Case 3: If the larger array 
         //has even number of element, //then median will be one of 
         //the following 3 elements
         //... The middle two elements
         // of larger array
         //... The only element of 
         // smaller array
         return MO3( $B [ $M /2], $B [ $M /2 - 1], $A [0]);
     }
  
     //If the smaller array
     //has two elements
     else if ( $N == 2)
     {
         //Case 4: If the larger 
         //array also has two elements, //simply call MO4()
         if ( $M == 2)
             return MO4( $A [0], $A [1], $B [0], $B [1]);
  
         //Case 5: If the larger array
         //has odd number of elements, //then median will be one of 
         //the following 3 elements
         //1. Middle element of 
         //larger array
         //2. Max of first element of 
         //smaller array and element
         //just before the middle 
         //in bigger array
         //3. Min of second element 
         //of smaller array and element
         //just after the middle 
         //in bigger array
         if ( $M & 1)
             return MO3 ( $B [ $M /2], max( $A [0], $B [ $M /2 - 1]), min( $A [1], $B [ $M /2 + 1]));
  
         //Case 6: If the larger array 
         //has even number of elements, //then median will be one of 
         //the following 4 elements
         //1) & 2) The middle two 
         //elements of larger array
         //3) Max of first element of 
         //smaller array and element
         //just before the first middle
         //element in bigger array
         //4. Min of second element of 
         //smaller array and element
         //just after the second
         //middle in bigger array
         return MO4 ( $B [ $M /2], $B [ $M /2 - 1], max( $A [0], $B [ $M /2 - 2]), min( $A [1], $B [ $M /2 + 1]));
     }
  
     $idxA = ( $N - 1 ) /2;
     $idxB = ( $M - 1 ) /2;
  
     /* if $A[$idxA] <= $B[$idxB], then
         median must exist in
         $A[$idxA....] and $B[....$idxB] */
     if ( $A [ $idxA ] <= $B [ $idxB ] )
     return findMedianUtil( $A + $idxA , $N /2 + 1, $B , $M - $idxA );
  
     /* if $A[$idxA]> $B[$idxB], then median must exist in
     $A[...$idxA] and $B[$idxB....] */
     return findMedianUtil( $A , $N /2 + 1, $B + $idxA , $M - $idxA );
}
  
//A wrapper function around
//findMedianUtil(). This 
//function makes sure that 
//smaller array is passed as 
//first argument to findMedianUtil
function findMedian(& $A , $N , & $B , $M )
{
     if ( $N> $M )
     return findMedianUtil( $B , $M , $A , $N );
  
     return findMedianUtil( $A , $N , $B , $M );
}
  
//Driver Code
$A = array (900);
$B = array (5, 8, 10, 20);
  
$N = sizeof( $A );
$M = sizeof( $B );
  
echo findMedian( $A , $N , $B , $M );
  
//This code is contributed
//by ChitraNayal
?>

输出如下:

10

复杂度分析:

  • 时间复杂度:O(min(log m, log n))。
    在每个步骤中, 将丢弃每个阵列的一半。因此, 该算法需要O(min(log m, log n))时间才能达到中值。
  • 空间复杂度:O(1)。
    不需要额外的空间。

如果发现任何不正确的地方, 或者想分享有关上述主题的更多信息, 请写评论。

木子山

发表评论

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen: