Copying MultiIndex dataframes with pd.read_clipboard?

Question

          C
A   B      
1.1 111  20
    222  31
3.3 222  24
    333  65
5.5 333  22
6.6 777  74

How do I read it in using pd.read_clipboard? I've tried this:

df = pd.read_clipboard(index_col=[0, 1])

But it throws an error:

ParserError: Error tokenizing data. C error: Expected 2 fields in line 3, saw 3

How can I fix this?

hhahahhhahhhaaahahhaahaaaaa..... you don't! Seriously, I wish you could. Feel free to write the parser and contribute to the pandas project. I'd appreciate it if you did (-: — piRSquared, Aug 17 '17 at 16:33
Wow... really? :( Like Scott, I always move the column down and then set the index manually. I thought I was the only guy doing something that redundant... huh. — cs95, Aug 17 '17 at 16:35
If you can collect all those cases to a notebook , I will much appreciate:) — BENY, Aug 17 '17 at 16:55
I guess I am lazier, I sometimes simply avoid answering the questions when I see a multi-index df:) — Vaishali, Aug 17 '17 at 16:57
I always assumed if it was possible, Andy Hayden would [know](https://stackoverflow.com/a/20159305/2285236) about it so I also use the same trick. — ayhan, Aug 17 '17 at 16:57
Could you **also** provide the dataframe which was used to produce the output as copy&pastable code in the question? It would help me (totally ignorant of multiindex) to compare solutions to something :) — MSeifert, Aug 17 '17 at 16:59
@MSeifert Sure, sorry about that. Link added. For reference, here it is as well: https://stackoverflow.com/questions/17921010/how-to-query-multiindex-index-columns-values-in-pandas — cs95, Aug 17 '17 at 17:00
@CoryMadden I tried it, and it results in `NaN`s in the wrong place. — cs95, Aug 17 '17 at 17:11
`df.to_clipboard()` actually does exactly the same thing @ScottBoston does. — Cory Madden, Aug 17 '17 at 17:31
Guys, could you please try to test [this function](https://stackoverflow.com/a/45741989/5741205)? Thank you! — MaxU - stand with Ukraine, Aug 17 '17 at 17:55

MaxU - stand with Ukraine · Accepted Answer · 2017-08-20T19:53:03.803

UPDATE: now it parses the clipboard - i.e. no need to save it beforehand

def read_clipboard_mi(index_names_row=None, **kwargs):
    encoding = kwargs.pop('encoding', 'utf-8')

    # only utf-8 is valid for passed value because that's what clipboard
    # supports
    if encoding is not None and encoding.lower().replace('-', '') != 'utf8':
        raise NotImplementedError(
            'reading from clipboard only supports utf-8 encoding')

    from pandas import compat, read_fwf
    from pandas.io.clipboard import clipboard_get
    from pandas.io.common import StringIO
    data = clipboard_get()

    # try to decode (if needed on PY3)
    # Strange. linux py33 doesn't complain, win py33 does
    if compat.PY3:
        try:
            text = compat.bytes_to_str(
                text, encoding=(kwargs.get('encoding') or
                                get_option('display.encoding'))
            )
        except:
            pass

    index_names = None
    if index_names_row:
        if isinstance(index_names_row, int):
            index_names = data.splitlines()[index_names_row].split()
            skiprows = [index_names_row]
            kwargs.update({'skiprows': skiprows})
        else:
            raise Exception('[index_names_row] must be of [int] data type')

    df = read_fwf(StringIO(data), **kwargs)
    unnamed_cols = df.columns[df.columns.str.contains(r'Unnamed:')].tolist()

    if index_names:
        idx_cols = df.columns[range(len(index_names))].tolist()
    elif unnamed_cols:
        idx_cols = df.columns[range(len(unnamed_cols))].tolist()
        index_names = [None] * len(idx_cols)

    df[idx_cols] = df[idx_cols].ffill()
    df = df.set_index(idx_cols).rename_axis(index_names)

    return df

testing multi-index DF without index names:

In [231]: read_clipboard_mi()
Out[231]:
          C
1.1 111  20
    222  31
3.3 222  24
    333  65
5.5 333  22
6.6 777  74

testing multi-index DF with index names:

In [232]: read_clipboard_mi(index_names_row=1)
Out[232]:
          C
A   B
1.1 111  20
    222  31
3.3 222  24
    333  65
5.5 333  22
6.6 777  74

NOTE:

it's not well tested
it does NOT support multi-level columns
see point 1 ;-)

NOTE2: please feel free to use this code or to create a pull request on Pandas github

Wow, looks amazing! You should consider making a PR to the pandas dev. — cs95, Aug 17 '17 at 17:57
Thank you guys! I don't think i have enough time and patience to write all necessarry tests... — MaxU - stand with Ukraine, Aug 17 '17 at 18:01
Well tests will come naturally since I plan on using it regularly. :) — ayhan, Aug 17 '17 at 18:18
will using this function instead of read_clipboard, and try to test it or even add more feature — BENY, Aug 17 '17 at 18:28
Yes, I will award a bounty for this. How much do you want? 50? 100? 200? This answer is certainly worth it. — cs95, Aug 18 '17 at 00:08
@cᴏʟᴅsᴘᴇᴇᴅ, glad it helps! It's not necessary, really ;-) — MaxU - stand with Ukraine, Aug 18 '17 at 07:50
@MaxU, are you going to do a pull request to get this feature added to Pandas? — Scott Boston, Aug 20 '17 at 18:58
@ScottBoston, no, it doesn't cover many cases like multilevel columns, etc. Beside that one would have to write additional tests... — MaxU - stand with Ukraine, Aug 20 '17 at 19:05
@ScottBoston, i've added a comment to the answer - maybe someone wil will have enough time and patience to create a PR... — MaxU - stand with Ukraine, Aug 20 '17 at 19:54

Copying MultiIndex dataframes with pd.read_clipboard?

1 Answers1

Linked